Introduction
TTP Agent SDK is a powerful JavaScript library for building AI-powered voice and chat interactions in your web applications.
What We Offer
Key Features
Direct connection using agentId and appId with domain whitelist access control
Colors, branding, languages, RTL support, and custom agent settings
Works seamlessly on desktop, tablet, and mobile devices
Built-in support for multiple languages and custom translations
WebSocket-based audio streaming with low latency
Simple CDN setup or NPM package with comprehensive API
Installation
NPM
npm install ttp-agent-sdk
CDN
<script src="https://unpkg.com/ttp-agent-sdk/dist/agent-widget.js"></script>
Import
// ES6 Import
import { VoiceSDK } from 'ttp-agent-sdk';
// CommonJS
const { VoiceSDK } = require('ttp-agent-sdk');
// Browser Global
const sdk = new window.TTPAgentSDK.VoiceSDK(config);
Quick Start
Get up and running in 5 minutes with this simple example.
Initialize the SDK
Create a VoiceSDK instance with your agent ID and app ID:
import { VoiceSDK } from 'ttp-agent-sdk';
const voiceSDK = new VoiceSDK({
agentId: 'agent_123', // The AI agent to connect to
appId: 'your_app_id', // Your application ID
// Optional: Configure audio formats (v2 protocol)
outputContainer: 'raw', // 'raw' or 'wav'
outputEncoding: 'pcm', // 'pcm', 'pcmu', 'pcma'
outputSampleRate: 24000, // Typical server/TTS output (default)
protocolVersion: 2 // Use v2 protocol for format negotiation
});
// Listen to events
voiceSDK.on('connected', () => {
console.log('✅ Connected to agent');
});
voiceSDK.on('formatNegotiated', (format) => {
console.log('✅ Format negotiated:', format);
// Format contains: container, encoding, sampleRate, channels, bitDepth
});
voiceSDK.on('message', (msg) => {
if (msg.t === 'agent_response') {
console.log('Agent:', msg.agent_response);
}
});
Connect & Start Recording
Connect to the agent and start capturing audio. If your agent uses a server-driven disclaimer, listen for disclaimersRequired, call sendDisclaimerAck(true) after the user accepts, then call startRecording().
// Connect
await voiceSDK.connect();
// Start recording
await voiceSDK.startRecording();
// Stop recording
await voiceSDK.stopRecording();
Authentication
The SDK connects directly using agentId and appId. No server-side authentication step is needed. Access control is managed via domain whitelist in your agent's admin panel.
How It Works
Pass agentId and appId in the SDK configuration. The SDK connects directly to the TTP backend via WebSocket.
const voiceSDK = new VoiceSDK({
agentId: 'agent_123', // Your AI agent ID
appId: 'your_app_id' // Your application ID
});
Domain Whitelist
To control which websites can use your agent, configure a domain whitelist in your agent's admin panel. Only requests originating from whitelisted domains will be accepted.
For local development, add localhost to the domain whitelist. Remove it before deploying to production.
Configuration Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| agentId | string | Yes | The AI agent identifier |
| appId | string | Yes | Your application identifier |
Agent Settings Override
Dynamically customize agent behavior, voice, and personality on a per-session basis.
How It Works
- Configure: Set up a domain whitelist for your agent in the admin panel
- Initialize: Pass agentSettingsOverride in the SDK configuration
- Connect: The SDK sends overrides in the hello message
- TTP Backend: Validates the domain and applies your overrides
Example
const voiceSDK = new VoiceSDK({
agentId: 'agent_123', // The AI agent to connect to
appId: 'your_app_id', // Your application ID
// Override agent settings
agentSettingsOverride: {
// Core settings
prompt: "You are a friendly Spanish-speaking travel assistant",
language: "es",
temperature: 0.9,
maxTokens: 200,
// Voice settings
voiceSpeed: 1.2,
voiceId: "nova", // Use voiceId (not selectedVoice)
// Behavior
firstMessage: "¡Hola! ¿Cómo puedo ayudarte hoy?",
disableInterruptions: false,
autoDetectLanguage: true,
// Tools (optional)
toolIds: [123, 456, 789], // Custom tool IDs
internalToolIds: ['calendar', 'email'] // Internal tool IDs
}
});
Available Override Settings
15 out of 16 settings can be overridden. Only model selection is not supported (requires infrastructure changes).
📝 Core Settings
- prompt - System prompt/instructions
- temperature - LLM temperature (0-2)
- maxTokens - Maximum response tokens
- model - ⚠️ NOT SUPPORTED
- language - Response language code
🔊 Voice Settings
- voiceId - Specific voice ID
- voiceSpeed - Speed multiplier (0.5-2)
⚙️ Behavior
- firstMessage - Initial greeting
- disableInterruptions - Allow/prevent barge-in
- autoDetectLanguage - Auto language detection
- candidateLanguages - List of languages for auto-detection
- maxCallDuration - Max session duration (seconds)
🛠️ Advanced
- toolIds - Array of custom tool IDs
- internalToolIds - Array of internal tool IDs
- timezone - User timezone
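The ranges and the unsupported model key listed above can be checked on the client before a session starts. The following is a minimal sketch, not part of the SDK: validateOverrides, NUMERIC_RANGES, and UNSUPPORTED_KEYS are hypothetical names, and the backend may apply different or additional validation.

```javascript
// Ranges and the unsupported key come from the settings list in this section.
const NUMERIC_RANGES = {
  temperature: [0, 2],
  voiceSpeed: [0.5, 2],
};
const UNSUPPORTED_KEYS = ['model'];

// Hypothetical helper (not an SDK API): returns a list of problems,
// or an empty array when the override object passes these checks.
function validateOverrides(overrides) {
  const problems = [];
  for (const key of UNSUPPORTED_KEYS) {
    if (key in overrides) problems.push(`${key} cannot be overridden`);
  }
  for (const [key, [min, max]] of Object.entries(NUMERIC_RANGES)) {
    const value = overrides[key];
    if (value === undefined) continue; // setting not overridden
    if (!Number.isFinite(value) || value < min || value > max) {
      problems.push(`${key} must be a number between ${min} and ${max}`);
    }
  }
  return problems;
}

console.log(validateOverrides({ temperature: 0.9, voiceSpeed: 1.2 }));
console.log(validateOverrides({ model: 'some-model', temperature: 3 }));
```

Running a check like this before constructing the SDK surfaces configuration mistakes early instead of at connection time.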
Variables in Hello Request
Overview
Variables allow you to pass dynamic values to your agent that will be used to replace placeholders in the system prompt and first message. Variables sent in the hello request take precedence over default variables stored in the agent configuration.
Hello Message Format
SDK v2 Format (Recommended)
When using VoiceSDK v2, variables are passed in the SDK constructor:
import { VoiceSDK_v2 } from 'ttp-agent-sdk';
const voiceSDK = new VoiceSDK_v2({
agentId: 'agent_5a2b984c1',
appId: 'app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC',
// Variables (optional)
variables: {
USER_NAME: 'John',
ACCOUNT_TYPE: 'premium',
LANGUAGE: 'en-US'
},
// Audio format configuration
sampleRate: 16000,
channels: 1,
bitDepth: 16,
outputContainer: 'raw',
outputEncoding: 'pcm',
outputSampleRate: 24000,
outputChannels: 1,
outputBitDepth: 16,
outputFrameDurationMs: 600
});
await voiceSDK.connect();
Raw WebSocket Format
If connecting via raw WebSocket (without SDK), send variables in the hello message:
{
"t": "hello",
"v": 2,
"variables": {
"USER_NAME": "John",
"ACCOUNT_TYPE": "premium",
"LANGUAGE": "en-US"
},
"inputFormat": {
"encoding": "pcm",
"sampleRate": 16000,
"channels": 1,
"bitDepth": 16
},
"requestedOutputFormat": {
"encoding": "pcm",
"sampleRate": 24000,
"channels": 1,
"bitDepth": 16,
"container": "raw"
},
"outputFrameDurationMs": 600
}
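To show how the constructor options in the SDK example line up with the raw hello frame, here is a minimal sketch of a mapping function. buildHelloMessage is a hypothetical helper, not an SDK API, and the fallback values simply mirror the examples in this section; apart from the 24000 Hz output rate, they are not confirmed SDK defaults.

```javascript
// Hypothetical helper: builds a v2 hello frame from SDK-style options.
function buildHelloMessage(config) {
  const hello = {
    t: 'hello',
    v: 2,
    inputFormat: {
      encoding: 'pcm',
      sampleRate: config.sampleRate ?? 16000,
      channels: config.channels ?? 1,
      bitDepth: config.bitDepth ?? 16,
    },
    requestedOutputFormat: {
      encoding: config.outputEncoding ?? 'pcm',
      sampleRate: config.outputSampleRate ?? 24000,
      channels: config.outputChannels ?? 1,
      bitDepth: config.outputBitDepth ?? 16,
      container: config.outputContainer ?? 'raw',
    },
    outputFrameDurationMs: config.outputFrameDurationMs ?? 600,
  };
  // Variables are optional, so the field is only added when there is something to send.
  if (config.variables && Object.keys(config.variables).length > 0) {
    hello.variables = { ...config.variables };
  }
  return hello;
}

const frame = buildHelloMessage({
  variables: { USER_NAME: 'John', ACCOUNT_TYPE: 'premium' },
  outputSampleRate: 24000,
});
console.log(JSON.stringify(frame, null, 2));
```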
Variable Format
Variables are sent as a JSON object where:
- Keys: Variable names (case-sensitive, e.g., USER_NAME)
- Values: String values that will replace {{VARIABLE_NAME}} in the prompt
Example
{
"variables": {
"USER_NAME": "John",
"ACCOUNT_TYPE": "premium",
"LANGUAGE": "en-US",
"COMPANY": "Acme Corp"
}
}
Variable Replacement Priority
Variables are replaced in the following priority order:
- Hello Variables (highest priority) - Variables sent in the hello request
- Default Variables - Variables stored in agent configuration (Redis)
- Leave as-is - If no value found, {{VARIABLE_NAME}} remains unchanged
Example Priority
Agent Configuration (Redis):
{
"USER_NAME": "David",
"ACCOUNT_TYPE": "premium"
}
Hello Request:
{
"variables": {
"USER_NAME": "John"
}
}
Result:
- {{USER_NAME}} → "John" (from hello - takes precedence)
- {{ACCOUNT_TYPE}} → "premium" (from defaults - hello doesn't override)
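The precedence rules above can be sketched in a few lines. This is an illustration of the documented behavior, not the backend's actual implementation; renderTemplate is a hypothetical name, and the placeholder pattern assumes variable names made of letters, digits, and underscores.

```javascript
// Hypothetical helper: hello variables win over defaults, unknown placeholders stay as-is.
function renderTemplate(template, helloVariables = {}, defaultVariables = {}) {
  // Later spreads override earlier ones, so hello values take precedence.
  const merged = { ...defaultVariables, ...helloVariables };
  return template.replace(/\{\{(\w+)\}\}/g, (placeholder, name) =>
    Object.prototype.hasOwnProperty.call(merged, name)
      ? String(merged[name])
      : placeholder
  );
}

const prompt = 'Hello {{USER_NAME}}, {{ACCOUNT_TYPE}} account, agent {{AGENT_NAME}}';
const defaults = { USER_NAME: 'David', ACCOUNT_TYPE: 'premium' };
const hello = { USER_NAME: 'John' };
console.log(renderTemplate(prompt, hello, defaults));
// prints "Hello John, premium account, agent {{AGENT_NAME}}"
```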
Usage Examples
Example 1: JavaScript/TypeScript with SDK v2
import { VoiceSDK_v2 } from 'ttp-agent-sdk';
const voiceSDK = new VoiceSDK_v2({
agentId: 'agent_5a2b984c1',
appId: 'app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC',
variables: {
USER_NAME: 'John Doe',
ACCOUNT_TYPE: 'premium',
LANGUAGE: 'en-US'
},
// ... audio format config
});
await voiceSDK.connect();
Example 2: Raw WebSocket (JavaScript)
const ws = new WebSocket('wss://speech.talktopc.com/ws/conv?agentId=agent_5a2b984c1&appId=app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC');
ws.onopen = () => {
const helloMessage = {
t: 'hello',
v: 2,
variables: {
USER_NAME: 'John',
ACCOUNT_TYPE: 'premium',
LANGUAGE: 'en-US'
},
inputFormat: {
encoding: 'pcm',
sampleRate: 16000,
channels: 1,
bitDepth: 16
},
requestedOutputFormat: {
encoding: 'pcm',
sampleRate: 24000,
channels: 1,
bitDepth: 16,
container: 'raw'
},
outputFrameDurationMs: 600
};
ws.send(JSON.stringify(helloMessage));
};
Example 3: Python WebSocket
import websocket
import json
def on_open(ws):
hello_message = {
"t": "hello",
"v": 2,
"variables": {
"USER_NAME": "John",
"ACCOUNT_TYPE": "premium",
"LANGUAGE": "en-US"
},
"inputFormat": {
"encoding": "pcm",
"sampleRate": 16000,
"channels": 1,
"bitDepth": 16
},
"requestedOutputFormat": {
"encoding": "pcm",
"sampleRate": 24000,
"channels": 1,
"bitDepth": 16,
"container": "raw"
},
"outputFrameDurationMs": 600
}
ws.send(json.dumps(hello_message))
ws = websocket.WebSocketApp(
"wss://speech.talktopc.com/ws/conv?agentId=agent_5a2b984c1&appId=app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC",
on_open=on_open
)
ws.run_forever()
Example 4: wscat
# Using wscat
wscat -c "wss://speech.talktopc.com/ws/conv?agentId=agent_5a2b984c1&appId=app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC"
# Then send:
{"t":"hello","v":2,"variables":{"USER_NAME":"John","ACCOUNT_TYPE":"premium"},"inputFormat":{"encoding":"pcm","sampleRate":16000,"channels":1,"bitDepth":16},"requestedOutputFormat":{"encoding":"pcm","sampleRate":24000,"channels":1,"bitDepth":16,"container":"raw"},"outputFrameDurationMs":600}
Agent Prompt Setup
To use variables in your agent, include placeholders in the system prompt or first message:
System Prompt Example
Your name is {{AGENT_NAME}}.
You are helping {{USER_NAME}} who has a {{ACCOUNT_TYPE}} account.
Speak in {{LANGUAGE}}.
First Message Example
Hello {{USER_NAME}}! Welcome to our {{ACCOUNT_TYPE}} service.
Backend Processing
When the hello request is received with variables:
- Variables are extracted from the hello message
- Default variables are loaded from agent configuration (Redis)
- Variables are merged (hello variables take precedence)
- Prompt is processed - {{VARIABLE_NAME}} placeholders are replaced
- First message is processed - Variables are replaced here too
- Metadata is added - Variables metadata section is appended to prompt
Server Logs
After sending hello with variables, check server logs for:
📝 Processing variables from hello message: [USER_NAME, ACCOUNT_TYPE, LANGUAGE]
✅ Variables processed and prompt updated
📝 FINAL PROCESSED PROMPT (agentId: ...):
[Prompt with variables replaced]
✅ Processed variables: 3 hello variables, 2 default variables, metadata: added
Variable Naming Conventions
- Use UPPERCASE with underscores: USER_NAME, ACCOUNT_TYPE
- Variable names are case-sensitive
- Avoid special characters except underscores
- Recommended format: {{VARIABLE_NAME}} in prompts
Common Use Cases
1. User Personalization
{
"variables": {
"USER_NAME": "John Doe",
"USER_EMAIL": "john@example.com"
}
}
2. Account Context
{
"variables": {
"ACCOUNT_TYPE": "premium",
"SUBSCRIPTION_STATUS": "active"
}
}
3. Language/Localization
{
"variables": {
"LANGUAGE": "en-US",
"CURRENCY": "USD"
}
}
4. Session Context
{
"variables": {
"SESSION_ID": "abc123",
"PAGE_URL": "https://example.com/products"
}
}
Error Handling
Missing Variables
If a variable is referenced in the prompt but not provided:
- Has default value: Uses default from agent configuration
- No default value: Placeholder remains unchanged ({{VARIABLE_NAME}})
Invalid Variable Format
- Variables must be a JSON object
- Values should be strings (will be converted to string if needed)
- Empty object {} or null is valid (will use defaults only)
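If you prefer to normalize variables on the client before sending them, the format rules above can be applied as follows. This is a sketch under assumptions: normalizeVariables is a hypothetical helper, and String() is only one plausible way to convert non-string values; the server's own conversion may differ.

```javascript
// Hypothetical helper: enforces "JSON object with string values" on the client.
function normalizeVariables(input) {
  // null (or a missing value) is valid and means "use defaults only".
  if (input === null || input === undefined) return {};
  if (typeof input !== 'object' || Array.isArray(input)) {
    throw new TypeError('variables must be a JSON object');
  }
  const result = {};
  for (const [name, value] of Object.entries(input)) {
    result[name] = String(value); // assumption: plain string conversion
  }
  return result;
}

console.log(normalizeVariables({ USER_NAME: 'John', VISIT_COUNT: 3 }));
console.log(normalizeVariables(null));
```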
Best Practices
- Set defaults in agent configuration for all variables
- Override with hello variables only when you have dynamic values
- Use descriptive names that clearly indicate the variable's purpose
- Document variables in your agent's description or notes
- Test variables by checking server logs for "FINAL PROCESSED PROMPT"
API Reference
Hello Message Structure
interface HelloMessage {
t: "hello"; // Message type
v?: number; // Protocol version (2 for v2)
variables?: { // Optional variables object
[key: string]: string; // Variable name -> value mapping
};
inputFormat?: AudioFormat; // Input audio format
requestedOutputFormat?: AudioFormat; // Output audio format
outputFrameDurationMs?: number; // Frame duration for streaming
}
interface AudioFormat {
encoding: "pcm" | "pcmu" | "pcma";
sampleRate: number;
channels: number;
bitDepth: number;
container?: "raw" | "wav"; // For output format only
}
Troubleshooting
Variables Not Being Replaced
- Check variable names match exactly (case-sensitive)
- Verify variables are sent in hello message (check logs)
- Check server logs for "FINAL PROCESSED PROMPT" to see actual replacement
Variables Not in Hello Message
- SDK v2: Check if SDK supports variables in constructor
- Raw WebSocket: Ensure variables field is included in JSON
- Check WebSocket message is sent after connection opens
Default Variables Not Used
- Verify variables are stored in Redis (check agent configuration)
- Check extractVariablesFromAgentConfig is working
- Look for "Default variables not found in state" warnings in logs
Events & Callbacks
The SDK emits events for all important state changes and interactions.
Event Categories
Connection Events
voiceSDK.on('connected', () => {
console.log('✅ Connected to agent');
});
voiceSDK.on('disconnected', (event) => {
console.log('❌ Disconnected:', event.reason);
console.log('Close code:', event.code);
});
voiceSDK.on('error', (error) => {
console.error('Error:', error);
});
Recording Events
voiceSDK.on('recordingStarted', () => {
console.log('🎤 Recording started');
});
voiceSDK.on('recordingStopped', () => {
console.log('⏹️ Recording stopped');
});
Message Events
voiceSDK.on('message', (msg) => {
switch (msg.t) { // The message type is carried in the "t" field, as in the Quick Start example
case 'agent_response':
console.log('Agent:', msg.agent_response);
break;
case 'transcription':
console.log('You said:', msg.text);
break;
// ... other message types
}
});
Audio Events
voiceSDK.on('playbackStarted', () => {
console.log('🔊 Audio playback started');
});
voiceSDK.on('playbackStopped', () => {
console.log('🔇 Audio playback stopped');
});
voiceSDK.on('audioData', (audioData) => {
// Raw audio data (Uint8Array)
});
Pause Events
// Call paused (server acknowledged)
voiceSDK.on('callPaused', (data) => {
console.log('⏸️ Call paused, timeout:', data.timeoutSeconds, 'seconds');
});
// Call resumed (server acknowledged, STT ready)
voiceSDK.on('callResumed', () => {
console.log('▶️ Call resumed');
});
// Pause timeout (call auto-ended because pause lasted too long)
voiceSDK.on('pauseTimeout', () => {
console.log('⏱️ Pause timeout — call ended');
});
Special Events
// Barge-in (user interrupts agent)
voiceSDK.on('bargeIn', (message) => {
console.log('User interrupted the agent');
});
// Format negotiation (v2 protocol only)
voiceSDK.on('formatNegotiated', (format) => {
console.log('Format negotiated:', format);
// format: { container, encoding, sampleRate, channels, bitDepth }
});
// Greeting audio
voiceSDK.on('greetingStarted', () => {
console.log('Playing greeting message');
});
// Domain whitelist error
voiceSDK.on('domainError', (error) => {
console.error('Domain not whitelisted:', error.reason);
});
// Server-driven disclaimer (voice v2); see the Server-driven disclaimer (voice) section
voiceSDK.on('disclaimersRequired', (payload) => {
// Show your UI, then call voiceSDK.sendDisclaimerAck(true|false)
});
voiceSDK.on('disclaimerRejected', ({ code, message }) => {
// DISCLAIMER_DECLINED, DISCLAIMER_TIMEOUT, DISCLAIMER_HASH_MISMATCH, etc.
});
Protocol v2 - Format Negotiation
The SDK v2 introduces format negotiation, allowing you to specify exactly what audio format you want to receive from the server.
Server-driven disclaimer (voice)
Some deployments must show exact legal or policy copy from the server before speech recognition and the agent greeting run. The conversation server can require an explicit acknowledgement step after hello_ack. This applies to VoiceSDK v2 (protocol version 2) and the Voice & Chat Widget voice path.
When the gate is active
- The agent has a non-empty disclaimers list in backend storage (Redis field disclaimers as a JSON array of plain-text strings). An empty array [] means no disclaimer gate.
- The session is not a resumed voice call (resume skips the gate).
While the gate is open, the server:
- Does not open STT or play the greeting.
- Rejects start_continuous_mode with {"ok":false,"t":"error","code":"DISCLAIMER_PENDING",...}.
- Drops uplink binary audio (microphone data) until the gate clears.
- Starts a server-side timer; if the user never acknowledges, the session is closed with DISCLAIMER_TIMEOUT (duration is configured on the server, typically on the order of minutes).
iOS Safari: microphone uplink
Voice capture uses an AudioWorklet. On iPhone and iPad, WebKit often runs the capture AudioContext at the device hardware rate (commonly 44.1 kHz or 48 kHz) even when a lower rate was requested, while the server expects PCM at the input rate negotiated in hello_ack (typically 16 kHz). The SDK resamples uplink PCM to that negotiated rate before sending binary frames. The recorder connects the worklet through a zero-gain node to the destination so WebKit reliably pulls the processor (graphs that dead-end at the worklet alone may not run on some builds). On mobile, after getUserMedia succeeds, the SDK may prime the shared recorder context while user activation is still fresh, before the WebSocket handshake and hello_ack finish. For embedded widgets, use allow="microphone" on the iframe when the host page is cross-origin.
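To make the resampling step concrete, here is a minimal sketch of downsampling captured samples from a hardware rate to the negotiated rate with linear interpolation. It is an illustration only: resampleLinear is a hypothetical function, the SDK's actual resampler is not documented here, and a production resampler would normally low-pass filter before downsampling to limit aliasing.

```javascript
// Hypothetical helper: linear-interpolation resampler for mono Float32 samples.
function resampleLinear(input, fromRate, toRate) {
  if (fromRate === toRate) return Float32Array.from(input);
  const outputLength = Math.floor((input.length * toRate) / fromRate);
  const output = new Float32Array(outputLength);
  const step = fromRate / toRate; // input samples consumed per output sample
  for (let i = 0; i < outputLength; i++) {
    const position = i * step;
    const left = Math.floor(position);
    const right = Math.min(left + 1, input.length - 1);
    const fraction = position - left;
    // Weighted average of the two nearest input samples.
    output[i] = input[left] * (1 - fraction) + input[right] * fraction;
  }
  return output;
}

// 10 ms of audio captured at 48 kHz becomes 160 samples at 16 kHz.
const captured = new Float32Array(480);
console.log(resampleLinear(captured, 48000, 16000).length);
```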
Embed sites: CSP and “Unable to load a worklet's module”
A strict Content-Security-Policy on the parent page can block audioWorklet.addModule() when the processor URL points at another host (for example cdn.talktopc.com). This surfaces as AbortError: Unable to load a worklet's module, and voice never reaches the server. The SDK tries that URL first, then automatically retries using the capture worklet source bundled inside the widget script and a blob: URL, which works on many sites without whitelisting the CDN. If both attempts fail, relax CSP (often worker-src / script-src must allow blob:, and/or the CDN origin for the processor), or host audio-processor.js on your own domain and set voice.audioProcessorPath to that same-origin URL.
hello_ack fields (gate active)
When disclaimers apply, the server includes:
| Field | Type | Description |
|---|---|---|
| disclaimersRequired | boolean | Must be true when the gate is active |
| disclaimerTexts | string[] | Plain-text lines to show the user (no HTML; escape/sanitize in your UI) |
| disclaimersHash | string | SHA-256 (hex) over the canonical text list; echoed in disclaimer_ack for verification |
| disclaimerTimeoutMs | number | Hint for UI (countdown copy); server enforces the real timeout independently |
Client message: disclaimer_ack
After the user accepts or declines, send:
{
"t": "disclaimer_ack",
"accepted": true,
"disclaimersHash": "sha256-from-hello_ack",
"conversationId": "optional-matches-hello_ack"
}
For decline, set "accepted": false. Duplicate acks for the same session are ignored. The SDK method sendDisclaimerAck(accepted) builds this frame using the hash and conversation id from hello_ack.
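For raw WebSocket clients that do not use sendDisclaimerAck, the frame can be assembled from the hello_ack payload. The sketch below is hedged: buildDisclaimerAck is a hypothetical helper, and it assumes hello_ack exposes the conversation id under the field name conversationId.

```javascript
// Hypothetical helper: builds the disclaimer_ack frame as a JSON string.
function buildDisclaimerAck(helloAck, accepted) {
  const frame = {
    t: 'disclaimer_ack',
    accepted: Boolean(accepted),
    // Echo the hash exactly as received so the server can verify it.
    disclaimersHash: helloAck.disclaimersHash,
  };
  // conversationId is optional; include it only when hello_ack carried one.
  if (helloAck.conversationId) {
    frame.conversationId = helloAck.conversationId;
  }
  return JSON.stringify(frame);
}

const helloAck = { disclaimersRequired: true, disclaimersHash: 'sha256-from-hello_ack' };
console.log(buildDisclaimerAck(helloAck, true));
```

With a raw socket, the result would be passed to ws.send() once the user has made a choice.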
Server error frames (t: "error")
| code | Meaning |
|---|---|
| DISCLAIMER_PENDING | Client tried to start continuous mode or stream audio before sending a successful ack; session stays open, so send disclaimer_ack and then retry |
| DISCLAIMER_DECLINED | User declined; server ends the conversation |
| DISCLAIMER_TIMEOUT | No acknowledgement in time; connection closed |
| DISCLAIMER_HASH_MISMATCH | Ack hash did not match server expectation; connection closed |
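A raw-protocol client might branch on these codes roughly as follows. This is a sketch: classifyDisclaimerError and its return labels are hypothetical, and only the four codes in the table are taken from the documentation.

```javascript
// Codes after which the server ends the session, per the table above.
const TERMINAL_DISCLAIMER_CODES = new Set([
  'DISCLAIMER_DECLINED',
  'DISCLAIMER_TIMEOUT',
  'DISCLAIMER_HASH_MISMATCH',
]);

// Hypothetical helper: tells the caller what to do with an incoming frame.
function classifyDisclaimerError(frame) {
  if (!frame || frame.t !== 'error') return 'not-an-error';
  // The session is still open: acknowledge first, then retry the request.
  if (frame.code === 'DISCLAIMER_PENDING') return 'ack-then-retry';
  if (TERMINAL_DISCLAIMER_CODES.has(frame.code)) return 'session-ended';
  return 'other-error';
}

console.log(classifyDisclaimerError({ ok: false, t: 'error', code: 'DISCLAIMER_PENDING' }));
console.log(classifyDisclaimerError({ t: 'error', code: 'DISCLAIMER_TIMEOUT' }));
```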
Using VoiceSDK v2 (custom apps)
- Use protocol v2 (default in current SDK): protocolVersion: 2.
- After connect(), wait for hello_ack. The SDK sets voiceSDK.disclaimersPending === true when the gate is active.
- Listen for disclaimersRequired. The payload includes texts, disclaimersHash, disclaimerTimeoutMs, and conversationId for your UI.
- Show your own modal or screen with the given texts. Do not call startRecording() until the user has accepted and you have called sendDisclaimerAck(true) (calling startRecording() while disclaimersPending is still true throws with error.code === 'DISCLAIMER_PENDING').
- On accept: voiceSDK.sendDisclaimerAck(true). The server then opens STT, plays the greeting, and accepts start_continuous_mode.
- On decline: voiceSDK.sendDisclaimerAck(false). The server closes the session; the SDK emits disclaimerRejected and the raw message event for the error frame.
- Handle disclaimerRejected for terminal server outcomes (DISCLAIMER_DECLINED, DISCLAIMER_TIMEOUT, DISCLAIMER_HASH_MISMATCH). Handle error for DISCLAIMER_PENDING (ordering bug or race; fix by acknowledging first).
const voiceSDK = new VoiceSDK({ agentId, appId, protocolVersion: 2 });
voiceSDK.on('disclaimersRequired', (payload) => {
  showMyModal({
    texts: payload.texts,
    onAccept: async () => {
      voiceSDK.sendDisclaimerAck(true);
      // The gate is cleared, so recording may start now
      await voiceSDK.startRecording();
    },
    onDecline: () => voiceSDK.sendDisclaimerAck(false)
  });
});
voiceSDK.on('disclaimerRejected', ({ code, message }) => {
  console.warn('Disclaimer flow ended:', code, message);
});
voiceSDK.on('error', (err) => {
  if (err.code === 'DISCLAIMER_PENDING') {
    console.warn('Start recording only after sendDisclaimerAck(true)');
  }
});
await voiceSDK.connect();
// Assumes hello_ack has been processed by the time connect() resolves.
// With no gate, start right away; otherwise onAccept above starts recording.
if (!voiceSDK.disclaimersPending) {
  await voiceSDK.startRecording();
}
SDK state you may read: disclaimersPending, disclaimersHash, lastDisclaimerPayload (set from hello_ack). After sendDisclaimerAck runs, the SDK clears disclaimersPending locally when the WebSocket send succeeds.
Voice & Chat Widget (built-in behavior)
No extra widget options are required. When the server sends the disclaimer gate:
- VoiceInterface waits after the WebSocket is up and hello_ack is processed.
- It opens a built-in modal (Notice / Accept / Decline) with disclaimerTexts from the server.
- Accept: on desktop, the widget sends sendDisclaimerAck(true) immediately, then requests the microphone and startListening. On mobile, it waits until the user has granted microphone access (and the post-grant audio delay) before sending sendDisclaimerAck(true), so the server does not stream the greeting over the system permission sheet or get cut off when capture starts.
- Decline (No thanks) calls sendDisclaimerAck(false), waits briefly so the ack reaches the server (which ends the conversation and closes the socket), then disconnects the client, invokes onConversationEnd on the wrapper SDK since recording never started, resets UI, and returns to landing (or idle voice in voice-only mode).
To match your site language, use the widget’s existing language / translation hooks for general UI; disclaimer body text always comes from the server (compliance copy).
Resume & text chat
Resume: Resumed voice sessions omit the disclaimer gate.
Text chat: Server-driven disclaimer is implemented for voice in this release. The text WebSocket hello path does not yet mirror these fields; extend the backend and TextChatSDK if you need the same gate for text-only sessions.
- Format Control: Request specific audio formats (container, encoding, sample rate, bit depth)
- Automatic Conversion: SDK automatically converts audio if backend sends different format
- Quality Optimization: Choose optimal formats for your use case (e.g., 48kHz for high quality, 8kHz for bandwidth savings)
- Protocol Support: Uses v2 protocol with format negotiation
Supported Formats
Input Formats (What SDK Sends)
| Property | Supported Values |
|---|---|
| encoding | 'pcm', 'pcmu' (μ-law), 'pcma' (A-law) |
| sampleRate | 8000, 16000, 22050, 24000, 44100, 48000 Hz |
| bitDepth | 8, 16, 24 bits |
| channels | 1 (mono only) |
Output Formats (What SDK Receives)
| Property | Supported Values |
|---|---|
| container | 'raw' (no header), 'wav' (with WAV header) |
| encoding | 'pcm', 'pcmu' (μ-law), 'pcma' (A-law) |
| sampleRate | 8000, 16000, 22050, 24000, 44100, 48000 Hz |
| bitDepth | 8, 16, 24 bits |
| channels | 1 (mono only) |
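The supported values in the tables above can be turned into a quick pre-flight check. This is a sketch, not an SDK API: SUPPORTED_VALUES and unsupportedFields are hypothetical names, and the server remains the authority on what it accepts.

```javascript
// Supported values copied from the input/output format tables above.
const SUPPORTED_VALUES = {
  container: ['raw', 'wav'], // output formats only
  encoding: ['pcm', 'pcmu', 'pcma'],
  sampleRate: [8000, 16000, 22050, 24000, 44100, 48000],
  bitDepth: [8, 16, 24],
  channels: [1],
};

// Hypothetical helper: returns the names of fields whose values are not listed.
function unsupportedFields(format) {
  return Object.keys(SUPPORTED_VALUES).filter(
    (field) =>
      format[field] !== undefined &&
      !SUPPORTED_VALUES[field].includes(format[field])
  );
}

console.log(unsupportedFields({ encoding: 'pcm', sampleRate: 48000, bitDepth: 16, channels: 1 }));
console.log(unsupportedFields({ encoding: 'pcm', sampleRate: 32000, channels: 2 }));
```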
Format Negotiation Flow
Configure requested output format
SDK sends format request in hello message
Server sends hello_ack with negotiated format
SDK emits 'formatNegotiated' event
If formats differ, SDK converts automatically
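The last step of the flow, deciding whether conversion is needed, amounts to comparing the requested format with the negotiated one field by field. The sketch below illustrates that comparison; conversionsNeeded is a hypothetical helper, and the SDK performs its own check internally.

```javascript
// Fields reported by the formatNegotiated event, per this document.
const FORMAT_FIELDS = ['container', 'encoding', 'sampleRate', 'channels', 'bitDepth'];

// Hypothetical helper: lists the fields where the server's format differs from the request.
function conversionsNeeded(requested, negotiated) {
  return FORMAT_FIELDS.filter(
    (field) =>
      requested[field] !== undefined &&
      negotiated[field] !== undefined &&
      requested[field] !== negotiated[field]
  );
}

const requested = { container: 'raw', encoding: 'pcm', sampleRate: 48000, channels: 1, bitDepth: 16 };
const negotiated = { container: 'raw', encoding: 'pcm', sampleRate: 24000, channels: 1, bitDepth: 16 };
console.log(conversionsNeeded(requested, negotiated)); // prints [ 'sampleRate' ]
```

A function like this can be called from a formatNegotiated handler to log which conversions will apply to the session.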
Example: High-Quality Audio
const voiceSDK = new VoiceSDK({
agentId: 'agent_123',
appId: 'your_app_id',
// Request high-quality audio
outputContainer: 'raw', // Raw PCM for lower latency
outputEncoding: 'pcm', // Uncompressed PCM
outputSampleRate: 48000, // 48kHz for high quality
outputBitDepth: 16, // 16-bit depth
outputChannels: 1, // Mono
outputFrameDurationMs: 600, // 600ms frames
protocolVersion: 2 // Enable format negotiation
});
voiceSDK.on('formatNegotiated', (format) => {
console.log('Negotiated format:', format);
// If backend sends different format, SDK will convert automatically
});
Example: Bandwidth-Optimized Audio
const voiceSDK = new VoiceSDK({
agentId: 'agent_123',
appId: 'your_app_id',
// Request compressed, low-bandwidth audio
outputContainer: 'raw',
outputEncoding: 'pcmu', // μ-law companding (G.711), typically used at 8 kHz
outputSampleRate: 8000, // 8kHz for bandwidth savings
outputBitDepth: 16,
outputChannels: 1,
protocolVersion: 2
});
Format Conversion
If the backend sends audio in a different format than requested, the SDK automatically converts it:
- Container: WAV ↔ Raw PCM extraction/wrapping
- Encoding: PCM ↔ PCMU/PCMA encoding/decoding
- Sample Rate: Automatic resampling using Web Audio API
- Bit Depth: 8-bit ↔ 16-bit ↔ 24-bit conversion
- Channels: Mono/stereo conversion (if needed)
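As an illustration of the encoding conversion, the sketch below expands μ-law (pcmu) bytes into 16-bit PCM using the standard G.711 μ-law expansion. decodeMuLaw is a hypothetical function name; the SDK's internal converter is not shown in this documentation and may be implemented differently.

```javascript
// Hypothetical helper: standard G.711 mu-law expansion to 16-bit linear PCM.
function decodeMuLaw(bytes) {
  const pcm = new Int16Array(bytes.length);
  for (let i = 0; i < bytes.length; i++) {
    const u = ~bytes[i] & 0xff; // mu-law bytes are stored bit-inverted
    const sign = u & 0x80;
    const exponent = (u >> 4) & 0x07;
    const mantissa = u & 0x0f;
    // Rebuild the magnitude: add the bias (0x84), scale by the segment, remove the bias.
    const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
    pcm[i] = sign ? -magnitude : magnitude;
  }
  return pcm;
}

// 0xff is silence; 0x80 and 0x00 are the largest positive and negative values.
console.log(decodeMuLaw(Uint8Array.of(0xff, 0x80, 0x00)));
```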
- Use protocolVersion: 2 for new projects
- Request formats that match your use case (quality vs. bandwidth)
- 48kHz is recommended for best quality (matches most browser defaults)
- Raw PCM is lower latency than WAV (no header overhead)
- Listen to formatNegotiated event to verify format
Voice & Chat Widget
Pre-built, customizable widget with voice and text chat - perfect for adding AI conversation to any website.
Agent display name
- Set the name on the voice idle hero (and the letter inside the avatar when no image URL is set) with the root property agentName only.
- Use header.pillTitle for a short CTA on the desktop floating pill and the mobile FAB (e.g. “Talk to me” / “דברו איתי”); when empty, both use header.title. Override the mobile FAB line only with header.mobileLabel. Separate from agentName.
- voice.agentName is not supported; it is removed from config when the widget merges settings.
- Legacy root headerTitle is ignored. Use agentName for the hero name and header.title for the general assistant title line (pill / mobile landing fallback).
- In unified mode, the text chat screen has a top bar with Voice or text (widget translation key backToModeChoice) to return to the voice/text choice inside the panel.
Voice & Text Chat
Beautiful interface with voice recording, text chat, and message history
Fully Customizable
Colors, position, size, RTL support, and custom branding
Mobile Optimized
Responsive design that works perfectly on all devices
Multi-language
Built-in support for multiple languages with custom translations
Installation
<!-- Add the SDK script to your page -->
<script src="https://unpkg.com/ttp-agent-sdk/dist/agent-widget.js"></script>
<!-- Initialize the widget -->
<script>
const widget = new TTPAgentSDK.TTPChatWidget({
agentId: 'agent_123',
appId: 'your_app_id'
});
</script>
Basic Configuration
const widget = new TTPAgentSDK.TTPChatWidget({
// Required
agentId: 'agent_123', // Your AI agent ID
appId: 'your_app_id', // Your application ID
// Optional — root only (voice idle hero name & avatar initial when no image)
agentName: 'Alex',
// Optional - Agent Settings Override (available when domain whitelist is configured)
agentSettingsOverride: {
prompt: "You are a helpful customer service assistant.",
temperature: 0.8,
voiceId: "F2",
voiceSpeed: 1.2,
firstMessage: "Hello! How can I help you today?",
disableInterruptions: false,
maxCallDuration: 600
},
// Optional - Appearance
primaryColor: '#7C3AED', // Widget theme color
position: { // Or shorthand: 'bottom-right', 'bottom-left'
vertical: 'bottom',
horizontal: 'right',
offset: { x: 20, y: 20 },
draggable: false, // true = user can drag launcher + panel
draggablePersist: true // remember drag position in localStorage
},
language: 'en', // 'en', 'es', 'fr', 'de', 'he', etc.
direction: 'ltr', // 'ltr' or 'rtl' for right-to-left languages
// Optional - Variables
variables: {
userName: 'John Doe',
page: 'homepage',
customData: 'value'
}
});
Server-driven disclaimer (voice)
If your agent has non-empty disclaimers configured on the server, the widget shows a Notice modal with the server-provided text before microphone streaming and the greeting. The user must tap Accept or Decline; you do not need extra embed code. For protocol fields, SDK hooks, and custom implementations, see Server-driven disclaimer (voice).
Access Control
The widget connects directly using agentId and appId. No backend authentication step is needed:
const widget = new TTPAgentSDK.TTPChatWidget({
agentId: 'agent_123',
appId: 'your_app_id',
variables: {
userName: 'John Doe',
page: 'homepage'
}
});
- The widget only needs agentId and appId to connect
- Access control is managed via domain whitelist in the admin panel
- WebSocket URLs use the production TalkToPC endpoint by default (same as voice). Pass websocketUrl only if you need a non-default backend.
Advanced Customization
Icon Customization
const widget = new TTPAgentSDK.TTPChatWidget({
agentId: 'agent_123',
appId: 'your_app_id',
icon: {
type: 'custom', // 'microphone', 'emoji', or 'custom'
// Omit customImage (or use '') for default animated waveform on the desktop pill
customImage: 'https://your-site.com/logo.png', // Optional: pill icon image URL
size: 60, // Icon size in pixels
backgroundColor: '#FFFFFF', // Background color
borderRadius: '50%' // Border radius (50% for circle)
}
});
Chat Window Customization
const widget = new TTPAgentSDK.TTPChatWidget({
agentId: 'agent_123',
appId: 'your_app_id',
chatWindow: {
width: 400, // Width in pixels
height: 600, // Height in pixels
title: 'Chat with us!', // Custom title
subtitle: 'We reply instantly', // Custom subtitle
placeholder: 'Type here...', // Input placeholder
borderRadius: 12 // Window border radius
}
});
Branding
const widget = new TTPAgentSDK.TTPChatWidget({
agentId: 'agent_123',
appId: 'your_app_id',
branding: {
companyName: 'Your Company',
logo: 'https://your-site.com/logo.png',
showPoweredBy: false // Hide "Powered by TTP" footer
}
});
Agent Settings Override
Dynamically customize agent behavior, voice, and personality on a per-session basis:
const widget = new TTPAgentSDK.TTPChatWidget({
agentId: 'agent_123',
appId: 'your_app_id',
// Override agent settings dynamically
agentSettingsOverride: {
// Core settings
prompt: "You are a friendly customer service assistant",
temperature: 0.8,
maxTokens: 200,
// Voice settings
voiceId: "F2",
voiceSpeed: 1.2,
// Behavior
firstMessage: "Hello! How can I help you today?",
disableInterruptions: false,
maxCallDuration: 600,
// Language
language: "en",
autoDetectLanguage: false
}
});
See the Agent Settings Override section for complete documentation of all available override settings.
RTL (Right-to-Left) Support
// For Hebrew, Arabic, etc.
const widget = new TTPAgentSDK.TTPChatWidget({
agentId: 'agent_123',
appId: 'your_app_id',
direction: 'rtl',
language: 'he', // Hebrew
position: 'bottom-left' // Better for RTL
});
Widget Methods
widget.open()
Programmatically open the chat window.
// Open chat from your own button
document.getElementById('myButton').onclick = () => {
widget.open();
};
widget.close()
Close the chat window.
widget.close();
widget.toggle()
Toggle chat window open/closed.
widget.toggle();
widget.minimize()
Collapse the chat panel back to the round launcher pill, the same visual state as if the user had clicked the launcher while the panel was open. Idempotent (no-op when already minimized) and does not end an active voice call: the WebSocket and conversation state are preserved, and the user just sees the bubble until they re-open. Also triggered automatically by a backend-pushed { t: 'minimize_widget' } control message, so partner integrations can choreograph the chat panel from the server side (e.g. minimize the widget when opening a native trolley drawer).
widget.minimize();
widget.maximize()
Expand the chat panel from the round launcher pill — same visual
state as if the user had clicked the launcher while the panel was
closed. Idempotent (no-op when already open) and runs the same
auto-connect side-effect a real click would when configured. Also
triggered automatically by a backend-pushed
{ t: 'maximize_widget' } control message.
widget.maximize();
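The widget handles the backend-pushed minimize_widget and maximize_widget control messages on its own. If you relay similar messages over your own channel, a dispatcher could look like the sketch below. applyControlMessage is a hypothetical helper and not part of the SDK; only the message shapes and the widget.minimize() / widget.maximize() methods come from this documentation.

```javascript
// Hypothetical helper (not an SDK API): map a control message to a documented widget method.
function applyControlMessage(widget, msg) {
  if (!msg || typeof msg.t !== 'string') return false;
  switch (msg.t) {
    case 'minimize_widget':
      widget.minimize(); // documented as idempotent: no-op when already minimized
      return true;
    case 'maximize_widget':
      widget.maximize(); // documented as idempotent: no-op when already open
      return true;
    default:
      return false; // not a panel-control message; leave it to other handlers
  }
}
```

The boolean return value lets the caller decide whether to pass an unhandled message on to other code.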
widget.destroy()
Remove the widget from the page.
widget.destroy();
widget.updateConfig(config)
Update widget configuration dynamically.
widget.updateConfig({
primaryColor: '#FF5733',
language: 'es',
agentName: 'Jordan' // root only; voice.agentName in this object is ignored
});
Widget Event Callbacks
Pass these as top-level config properties when constructing TTPChatWidget:
const widget = new TTPAgentSDK.TTPChatWidget({
agentId: 'agent_123',
appId: 'your_app_id',
onConversationStart: () => {
console.log('Voice conversation started');
},
onConversationEnd: () => {
console.log('Voice conversation ended');
},
onBargeIn: () => {
console.log('User interrupted the agent');
},
onAudioStartPlaying: () => {
console.log('Agent audio started');
},
onAudioStoppedPlaying: () => {
console.log('Agent audio stopped');
},
onSubtitleDisplay: (subtitle) => {
console.log('Subtitle:', subtitle);
},
onVoiceCallButtonClick: () => {
// Return false to prevent starting the call
return true;
}
});
For lower-level VoiceSDK events (onConnected, onMessage, etc.), use widget.voiceInterface.sdk or instantiate VoiceSDK directly. See Events & Callbacks.
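As a rough sketch of the widget.voiceInterface.sdk route, the helper below subscribes to VoiceSDK events through the widget. attachSdkListeners is a hypothetical name, and the guard assumes the SDK instance may not exist until the voice interface has been created; the on(eventName, handler) call and the 'connected' and 'message' event names are taken from the Quick Start.

```javascript
// Hypothetical helper: subscribe to VoiceSDK events via the widget, if the SDK instance exists.
function attachSdkListeners(widget, handlers) {
  const sdk = widget && widget.voiceInterface && widget.voiceInterface.sdk;
  // Assumption: the SDK instance may be created lazily, so bail out quietly if it is missing.
  if (!sdk || typeof sdk.on !== 'function') return false;
  for (const [eventName, handler] of Object.entries(handlers)) {
    sdk.on(eventName, handler);
  }
  return true;
}

// Usage (in the browser, after constructing the widget):
// attachSdkListeners(widget, {
//   connected: () => console.log('Connected to agent'),
//   message: (msg) => console.log('Raw message:', msg),
// });
```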
Complete Example
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>My Website with AI Chat</title>
</head>
<body>
<h1>Welcome to my website!</h1>
<!-- Load the SDK -->
<script src="https://unpkg.com/ttp-agent-sdk/dist/agent-widget.js"></script>
<!-- Initialize widget -->
<script>
const widget = new TTPAgentSDK.TTPChatWidget({
agentId: 'agent_123',
appId: 'your_app_id',
// Customize appearance
primaryColor: '#7C3AED',
position: 'bottom-right',
language: 'en',
// Custom branding
chatWindow: {
title: 'Chat with us!',
subtitle: 'We typically reply instantly'
},
// Pass context variables
variables: {
userName: 'Visitor',
page: window.location.pathname,
referrer: document.referrer
},
// Event handlers
onReady: () => {
console.log('Chat widget ready!');
},
onMessage: (message) => {
// Track messages in analytics
console.log('Message:', message);
}
});
// Optional: Open chat programmatically
// widget.open();
</script>
</body>
</html>
Configuration Reference
🎨 Extensive Customization
The Voice & Chat Widget can be customized in almost every aspect - colors, text, icons, sizes, behaviors, and more!
Agent naming: use root agentName for the voice idle hero and avatar letter fallback. Do not use voice.agentName (removed on merge). Do not use legacy root headerTitle (ignored). See Agent display name above.
See the live customization demo to experiment with all options interactively, including quick themes (Default, Light, Sunset, Hebrew, S-Law).
Required Configuration
| Property | Type | Description |
|---|---|---|
| agentId | string | Your AI agent identifier |
| appId | string | Your application identifier |
General Configuration
| Property | Type | Default | Description |
|---|---|---|---|
| primaryColor | string | '#7C3AED' | Main theme color (hex) |
| direction | string | 'ltr' (or 'rtl' when language is he / ar) | 'ltr' or 'rtl'. If omitted, Hebrew and Arabic default to 'rtl'. Sets the shadow host dir so the desktop pill launcher matches the panel (fixes extra padding beside the logo on RTL sites). In 'rtl', transcript, bubbles, and mobile bar use RTL punctuation order. |
| language | string | 'en' | Language code (en, es, fr, de, he, ar, etc.) |
| agentName | string | - | Root only. Name on the voice idle hero and first-letter avatar fallback when no header image is set. If omitted or empty, the widget defaults to Sasha. Color: voice.agentNameColor. Not the desktop pill text; use header.pillTitle or header.title for that. |
| headerTitle | string | - | Deprecated / ignored. Former optional root field; it is no longer read. Use agentName for the voice hero name and header.title for the shared assistant title line. |
| variables | object | {} | Custom variables to pass to agent |
| websocketUrl | string | wss://speech.talktopc.com/ws/conv (built-in) | Optional override for voice and text. Omit unless you point at a custom backend; text still uses /chat/text on the same host as this base. |
| agentSettingsOverride | object | null | Override agent settings dynamically. See Agent Settings Override for details. |
| customStyles | string | '' | Custom CSS to inject |
| useShadowDOM | boolean | true | Enable Shadow DOM for CSS isolation. Set to false for Shopify compatibility. See Shadow DOM Configuration below. |
| mobileVoiceUI | boolean | auto | Root-level override. Force mobile-style voice call UI (bottom bar + overlay) or desktop hero. Takes precedence over behavior.mobileVoiceUI when both are set. Omit for auto: native iOS/Android, or viewport ≤768px with touch/coarse pointer. |
| inputFormat | object | - | Optional v2 input audio format passed to the voice WebSocket hello message: { encoding, sampleRate, channels, bitDepth }. Same fields as top-level VoiceSDK config (sampleRate, etc.). See Protocol v2. |
| visualAssistant | object | null | Enable browser-side visual assistant tools (page read, highlight, scroll, navigate, form fill, click, screenshot). Also accepted under agentSettingsOverride.visualAssistant. See Visual Assistant below. |
| whatsapp | object | null | Optional WhatsApp handoff on the voice idle hero: { number: '972501234567', text: 'optional pre-filled message' }. Non-digits are stripped from number. Omit to hide the WhatsApp button. |
Shadow DOM Configuration (useShadowDOM)
What is Shadow DOM?
Shadow DOM is a web standard that provides CSS isolation by creating a separate DOM tree that doesn't inherit styles from the parent page. This prevents theme CSS from interfering with the widget's appearance.
When to Use Shadow DOM:
- ✅ WordPress: Use Shadow DOM (useShadowDOM: true, or omit it; it defaults to true)
- ✅ Most platforms: Shadow DOM works well on most websites and platforms
- ❌ Shopify: Disable Shadow DOM (useShadowDOM: false) due to rendering issues
Why We Need This Option:
While Shadow DOM provides excellent CSS isolation, some platforms (notably Shopify) have rendering issues where Shadow DOM elements render with 0x0 dimensions, making the widget invisible. By setting useShadowDOM: false, the widget uses regular DOM with targeted CSS resets instead, ensuring visibility while still protecting against most theme conflicts.
Platform-Specific Recommendations:
| Platform | Recommended Setting | Reason |
|---|---|---|
| WordPress | useShadowDOM: true (default) | Shadow DOM works perfectly and provides better CSS isolation |
| Shopify | useShadowDOM: false | Shadow DOM elements render with 0x0 dimensions, making widget invisible |
| Wix | useShadowDOM: true (default) | Shadow DOM works well on Wix |
| Custom Websites | useShadowDOM: true (default) | Use Shadow DOM unless you experience rendering issues |
Example Usage:
// WordPress (default - Shadow DOM enabled)
const widget = new TTPChatWidget({
agentId: 'your-agent-id',
// useShadowDOM defaults to true, so widget is isolated from theme CSS
});
// Shopify (disable Shadow DOM)
const widget = new TTPChatWidget({
agentId: 'your-agent-id',
useShadowDOM: false // Required for Shopify compatibility
});
// If widget is invisible, try disabling Shadow DOM
const widget = new TTPChatWidget({
agentId: 'your-agent-id',
useShadowDOM: false // Fallback if Shadow DOM causes rendering issues
});
How It Works:
- With Shadow DOM (useShadowDOM: true): Widget is rendered inside a Shadow DOM tree, completely isolated from page CSS. Styles are injected into the shadow root.
- Without Shadow DOM (useShadowDOM: false): Widget is rendered in regular DOM. Styles are injected into the document <head> with high-specificity selectors to prevent theme CSS conflicts. Targeted CSS resets protect against common theme issues while preserving widget functionality.
⚠️ Important Notes:
- When useShadowDOM: false, the widget uses targeted CSS resets instead of aggressive resets to preserve internal widget styles
- If you experience layout issues with useShadowDOM: false, check if your theme's CSS is overriding widget styles; you may need to add more specific CSS rules
- Unified mode: With useShadowDOM: false, the widget must show only one of voice or text at a time inside the panel. A previous reset rule forced display:flex on multiple roots (higher specificity than the hide rules), which made them stack vertically on Shopify; this is fixed in current builds.
- Voice call UI: Call duration updates and the red recording dot use DOM queries scoped to #ttp-widget-container (not document), and extra CSS guards the dot and pulse when themes override spans. This fixes static timers and missing dots on Shopify.
- Orb waveform: Bars are injected under #ttp-widget-container. With useShadowDOM: false, the same CSS ttp-wave keyframes and per-bar delays are used as in Shadow DOM; high-specificity rules protect size and animation without fixing opacity (which would break the keyframe fade).
- The widget automatically handles CSS injection differently based on this setting; no additional configuration is needed
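If the same embed snippet is shared across several storefronts, the useShadowDOM value can be chosen at runtime. The sketch below assumes that Shopify themes expose a global Shopify object on the page; that is a property of the host platform, not something this SDK guarantees, so verify it on your own theme. chooseUseShadowDOM is a hypothetical helper.

```javascript
// Hypothetical helper: pick a useShadowDOM value from the page's global object.
// Assumption: Shopify storefronts define a global `Shopify` object.
function chooseUseShadowDOM(globalObject) {
  const onShopify = Boolean(
    globalObject && typeof globalObject.Shopify === 'object' && globalObject.Shopify !== null
  );
  return !onShopify; // false on Shopify, true (the documented default) elsewhere
}

// Usage (in the browser):
// const widget = new TTPAgentSDK.TTPChatWidget({
//   agentId: 'your-agent-id',
//   appId: 'your-app-id',
//   useShadowDOM: chooseUseShadowDOM(window),
// });
```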
Positioning
| Property | Type | Default | Description |
|---|---|---|---|
| position | string or object | 'bottom-right' | String: 'bottom-right', 'bottom-left'. Object: { vertical, horizontal, offset } |
| position.vertical | string | 'bottom' | 'top' or 'bottom' |
| position.horizontal | string | 'right' | 'left' or 'right' |
| position.offset | object | { x: 20, y: 20 } | Offset from edges (pixels) |
| position.draggable | boolean | false | When true, visitors can drag the launcher pill / mobile FAB and the open chat panel around the viewport. They move together as one unit (never split). Drag handles: the pill/FAB and panel header bars (voice idle header, active-call top bar, text chat top bar). Header buttons (close, etc.) still work normally. |
| position.draggablePersist | boolean | true | When true (default), the dragged position is saved in localStorage per origin+path and restored on the next visit. Set false to reset to the configured corner on every load. The widget unit and the desktop minimized voice strip (flavor.callView: 'minimized') remember positions independently. |
| positionOffset | object | - | Legacy. Used only when position is a string (e.g. 'bottom-right') instead of an object. Prefer position.offset. |
When position.draggable is enabled, the desktop minimized voice strip (flavor.callView: 'minimized') is also draggable by its body (controls and text input still work). Positions are clamped to stay on-screen and re-clamped on window resize.
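The string and object forms of position, and the on-screen clamping of dragged positions, can be pictured with the sketch below. Both functions are hypothetical illustrations written from the table above; the widget's own normalization and clamping code may differ.

```javascript
const DEFAULT_OFFSET = { x: 20, y: 20 }; // documented default for position.offset

// Illustration only: turn either form of `position` into one object shape.
function normalizePosition(position = 'bottom-right', positionOffset) {
  if (typeof position === 'string') {
    const [vertical, horizontal] = position.split('-');
    // The legacy positionOffset applies only to the string form.
    return {
      vertical,
      horizontal,
      offset: { ...DEFAULT_OFFSET, ...positionOffset },
      draggable: false,
      draggablePersist: true,
    };
  }
  return {
    vertical: position.vertical || 'bottom',
    horizontal: position.horizontal || 'right',
    offset: { ...DEFAULT_OFFSET, ...position.offset },
    draggable: Boolean(position.draggable),
    draggablePersist: position.draggablePersist !== false,
  };
}

// Illustration only: keep a dragged top-left corner inside the viewport.
function clampToViewport(x, y, size, viewport) {
  const maxX = Math.max(0, viewport.width - size.width);
  const maxY = Math.max(0, viewport.height - size.height);
  return { x: Math.min(Math.max(0, x), maxX), y: Math.min(Math.max(0, y), maxY) };
}
```

Re-running clampToViewport with the new viewport size on every resize event gives the re-clamping behavior described above.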
Icon & Button
| Property | Type | Default | Description |
|---|---|---|---|
| icon.type | string | 'custom' | 'microphone', 'custom', 'emoji', 'text' |
| icon.customImage | string | - | Optional HTTPS image URL for the desktop floating pill icon. When omitted or empty, the pill uses the same animated waveform as the mobile launcher and pre-chat landing (white bars on a frosted circle). |
| icon.size | string | 'medium' | 'small', 'medium', 'large', 'xl' |
| icon.backgroundColor | string | '#FFFFFF' | Icon background color |
| button.size | string | 'medium' | 'small', 'medium', 'large' |
| button.shape | string | 'circle' | 'circle', 'rounded', 'square' |
| button.backgroundColor | string | primaryColor | Button background color |
| button.hoverColor | string | '#7C3AED' | Button hover color |
| button.shadow | boolean | true | Enable button shadow |
Panel & Header
| Property | Type | Default | Description |
|---|---|---|---|
| panel.width | number | 350 | Panel width (pixels) |
| panel.height | number | 500 | Panel height (pixels) |
| panel.borderRadius | number | 12 | Border radius (pixels) |
| panel.backgroundColor | string | '#FFFFFF' | Panel background color |
| header.title | string | 'Chat Assistant' | Default title when more specific labels are omitted: fallback for the desktop and mobile launchers if header.pillTitle is not set (unless header.mobileLabel is set), and for the mobile pre-chat landing name when root agentName is not set. Does not set the voice idle hero name; use root agentName for that. |
| header.pillTitle | string | '' | Optional main line on the desktop floating pill and the mobile pill launcher (same fallback chain as desktop: when empty, uses header.title). Set header.mobileLabel if the FAB needs different copy than the desktop pill. Does not change the voice idle hero name; use root agentName for that. |
| header.showTitle | boolean | true | Show/hide header title |
| header.backgroundColor | string | '#7C3AED' | Header background color |
| header.textColor | string | '#FFFFFF' | Header text color |
| header.mobileLabel | string | - | Optional: overrides the mobile pill launcher’s main line only. When omitted, the FAB uses the same text as the desktop pill (header.pillTitle or header.title). |
| header.showCloseButton | boolean | true | Show or hide the panel close button in the header. |
| header.onlineIndicatorText | string | Auto | Online status label on the desktop pill, mobile FAB, and headers. When omitted, uses the translated “Online” string for the widget language. |
| header.onlineIndicatorColor | string | header.textColor | Text color for the online indicator label. |
| header.onlineIndicatorDotColor | string | '#10b981' | Color of the online status dot next to the indicator text. |
| footer.show | boolean | true | Show or hide the "Powered by" footer. |
| footer.brand | string | 'talktopc' | Brand shown in the "Powered by" footer. 'talktopc' links to talktopc.com; 'speacart' links to speacart.com. Any other value defaults to TalkToPC. |
| footer.backgroundColor | string | '#f9fafb' | Footer background color. |
| footer.textColor | string | '#9ca3af' | Footer text color. |
| footer.hoverColor | string | '#7C3AED' | Footer link hover color. |
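The launcher label fallbacks described for header.title, header.pillTitle, and header.mobileLabel can be summarized in a few lines. resolveLauncherLabels is a hypothetical helper that mirrors the table; it is a sketch of the documented chain, not the widget's source.

```javascript
// Illustration only: which text ends up on the desktop pill and the mobile pill launcher.
function resolveLauncherLabels(header = {}) {
  const title = header.title || 'Chat Assistant'; // documented default of header.title
  const desktopPill = header.pillTitle || title;  // empty pillTitle falls back to title
  const mobilePill = header.mobileLabel || desktopPill; // mobileLabel overrides mobile only
  return { desktopPill, mobilePill };
}
```

Note that none of these values affect the voice idle hero name, which comes from root agentName.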
Messages & Chat
On desktop, when the text chat view is open, the panel uses a fixed height (up to min(520px, 100vh − 100px)) so new messages scroll inside the transcript instead of stretching the card. By default (text.useVoiceTheme not false), the text UI shares the voice theme: voice.heroGradient1 / heroGradient2 surface (same as the voice idle hero), voice.primaryBtnGradient* and startCallButtonColor for send/focus and user bubbles (#primary40 tint), avatarGradient1/2 on the assistant avatar, and transcript-style translucent inputs. The “Powered by” footer uses the same link accent as on the hero. Set text.useVoiceTheme: false for a light layout driven by solid panel.backgroundColor and messages.* colors.
| Property | Type | Default | Description |
|---|---|---|---|
| messages.userBackgroundColor | string | '#E5E7EB' | User message background |
| messages.agentBackgroundColor | string | '#F3F4F6' | Agent message background |
| messages.systemBackgroundColor | string | '#DCFCE7' | System message background |
| messages.errorBackgroundColor | string | '#FEE2E2' | Error message background |
| messages.textColor | string | '#1F2937' | Message text color (fallback when role-specific colors are omitted) |
| messages.userTextColor | string | messages.textColor | User message text color |
| messages.agentTextColor | string | messages.textColor | Agent message text color |
| messages.userAvatarIcon | string | '👤' | Emoji/icon shown on user message avatars |
| messages.agentAvatarIcon | string | '🤖' | Emoji/icon shown on agent message avatars |
| messages.fontSize | string | '16px' | Message font size |
| messages.borderRadius | number | 8 | Message bubble radius |
| text.useVoiceTheme | boolean | true | When true, text chat matches the voice idle hero gradient, primary gradients, avatar gradients, and dark translucent bubbles/inputs (see voice theme). When false, chrome follows solid panel.backgroundColor (hex) and messages.userBackgroundColor / agentBackgroundColor. |
| text.sendButtonColor | string | voice accent | Send button fill; defaults to the first resolvable hex from voice.primaryBtnGradient1, voice.primaryBtnGradient2, or voice.startCallButtonColor (including the default start-call indigo when those are unset), then #7C3AED. |
| text.sendButtonHoverColor | string | voice accent 2 / shaded | Hover state; uses voice.primaryBtnGradient2 when it resolves to a different hex than the send color, otherwise a slightly darker shade of the accent. |
| text.sendButtonActiveColor | string | same as hover | Active press state; same resolution as hover. |
| text.sendButtonText | string | '➤' | Send button text/icon |
| text.sendButtonTextColor | string | '#FFFFFF' | Send button text color |
| text.sendButtonFontSize | string | '20px' | Send button font size |
| text.sendButtonFontWeight | string | '500' | Send button font weight |
| text.inputPlaceholder | string | 'Type your message...' | Input placeholder text |
| text.inputBorderColor | string | '#E5E7EB' | Input border color |
| text.inputFocusColor | string | voice accent | Input focus border and ring; same default chain as text.sendButtonColor. |
| text.inputBackgroundColor | string | '#FFFFFF' | Input background color |
| text.inputTextColor | string | '#1F2937' | Input text color |
| text.inputFontSize | string | '16px' | Input font size |
| text.inputBorderRadius | number | 20 | Input border radius (pixels) |
| text.inputPadding | string | '8px 16px' | Input padding |
| text.sendButtonHint.text | string | '' | Optional hint text below or near the send button |
| text.sendButtonHint.color | string | '#6B7280' | Send button hint text color |
| text.sendButtonHint.fontSize | string | '14px' | Send button hint font size |
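The default chain for text.sendButtonColor (and text.inputFocusColor, which shares it) can be read as the sketch below. This is one interpretation of the table: "resolvable hex" is assumed to mean a 3- or 6-digit hex string, and resolveSendButtonColor is a hypothetical name rather than an SDK function.

```javascript
const HEX_COLOR = /^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$/; // assumption: what counts as a resolvable hex
const DEFAULT_START_CALL_COLOR = '#667eea'; // documented default of voice.startCallButtonColor
const FINAL_FALLBACK = '#7C3AED';

// Illustration only: resolve the send button fill from a widget config object.
function resolveSendButtonColor(config = {}) {
  const text = config.text || {};
  const voice = config.voice || {};
  if (text.sendButtonColor) return text.sendButtonColor; // an explicit value wins
  const candidates = [
    voice.primaryBtnGradient1,
    voice.primaryBtnGradient2,
    voice.startCallButtonColor || DEFAULT_START_CALL_COLOR,
  ];
  const firstHex = candidates.find((c) => typeof c === 'string' && HEX_COLOR.test(c));
  return firstHex || FINAL_FALLBACK;
}
```

Under this reading, the final #7C3AED fallback is reached only when every candidate is set to a value that is not a plain hex color.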
Voice Configuration
The agent’s spoken/idle display name is configured with root agentName only. Any voice.agentName value in your JSON is stripped during merge and has no effect (widget.updateConfig({ voice: { agentName: '…' } }) is ignored for naming).
| Property | Type | Default | Description |
|---|---|---|---|
| voice.micButtonColor | string | primaryColor | Microphone button color (inside panel) |
| voice.micButtonActiveColor | string | '#EF4444' | Microphone button color when active |
| voice.micButtonHint.text | string | 'Click the button to start...' | Hint text below mic button |
| voice.micButtonHint.color | string | '#6B7280' | Hint text color |
| voice.avatarBackgroundColor | string | '#667eea' | Voice avatar background |
| voice.avatarActiveBackgroundColor | string | '#667eea' | Avatar background when active |
| voice.statusTitleColor | string | '#1e293b' | Status title text color |
| voice.statusSubtitleColor | string | '#64748b' | Status subtitle text color |
| voice.startCallTitle | string | null | Custom text for "Click to Start Call" title (bypasses translations) |
| voice.startCallSubtitle | string | null | Custom text for "Real-time voice conversation" subtitle (bypasses translations) |
| voice.startCallButtonText | string | null | Custom text for "Start Call" button (bypasses translations) |
| voice.startCallButtonColor | string | '#667eea' | Fills the start-call button gradient when primaryBtnGradient1/2 are omitted. The “TalkToPC” footer link uses the same solid accent as the start button (primaryBtnGradient1, then startCallButtonColor, then defaults). |
| voice.startCallButtonTextColor | string | '#FFFFFF' | Start call button text color |
| voice.endCallButtonColor | string | '#ef4444' | End call button color |
| voice.transcriptBackgroundColor | string | '#FFFFFF' | Transcript background |
| voice.transcriptTextColor | string | '#1e293b' | Transcript text color |
| voice.transcriptLabelColor | string | '#94a3b8' | Transcript label color |
| voice.userTranscriptPrefix | string or null | null | Prefix before live user speech (STT) in the collapsed transcript strip and mobile bar (e.g. "You: "). null uses the translation key userTranscriptPrefix for the widget language; empty string removes the prefix. |
| voice.controlButtonColor | string | '#FFFFFF' | Control button color |
| voice.controlButtonSecondaryColor | string | '#64748b' | Secondary control button color |
| voice.language | string | 'en' | Voice language (overrides global) |
| voice.statusDotColor | string | '#10b981' | Status dot color on the voice idle screen |
| voice.statusText | string or null | null | Custom status line on the voice idle screen. null uses the translated default. |
| voice.outputContainer | string | 'raw' | Output audio container for v2 format negotiation: 'raw' or 'wav' |
| voice.outputEncoding | string | 'pcm' | Output encoding: 'pcm', 'pcmu', or 'pcma' |
| voice.outputSampleRate | number | 24000 | Requested output sample rate (Hz): 8000, 16000, 22050, 24000, 44100, or 48000 |
| voice.outputChannels | number | 1 | Output channels (mono only supported) |
| voice.outputBitDepth | number | 16 | Output bit depth: 8, 16, or 24 |
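Because the output audio settings accept only the values listed in the table, a small pre-flight check can catch typos before the widget is constructed. The helper below is a hypothetical sketch built from the table; the SDK may perform its own validation or negotiation that differs from it.

```javascript
// Allowed values copied from the voice.output* rows above.
const ALLOWED_OUTPUT = {
  outputContainer: ['raw', 'wav'],
  outputEncoding: ['pcm', 'pcmu', 'pcma'],
  outputSampleRate: [8000, 16000, 22050, 24000, 44100, 48000],
  outputChannels: [1], // mono only
  outputBitDepth: [8, 16, 24],
};

// Hypothetical helper: return the names of any output settings with unsupported values.
function findInvalidOutputSettings(voice = {}) {
  return Object.keys(ALLOWED_OUTPUT).filter(
    (key) => voice[key] !== undefined && !ALLOWED_OUTPUT[key].includes(voice[key])
  );
}
```

Settings that are left undefined are skipped, since the documented defaults apply to them.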
Voice Theming & Pill Launcher
🎨 Full Voice UI Theming
Customize every visual aspect of the voice interface — hero section, buttons, active call screen, pill launcher, and more. Use these properties to create branded themes or match your website's design.
The desktop pill has a fixed width (158px) on viewports ≥769px with a taller vertical layout (padding, 36px icon circle, 13px/11px title/status). Long titles ellipsize. RTL uses logical padding so the logo sits evenly on the start edge.
Pill Launcher
| Property | Type | Default | Description |
|---|---|---|---|
voice.pillGradient | string | '' | CSS gradient for the mobile floating pill, the mobile pre-chat landing sheet (.ttp-mobile-landing), and the mobile in-call bottom bar plus the expanded conversation header—same token everywhere. When unset, the widget uses the same default three-stop purple as the landing sheet (linear-gradient(135deg, #581c87, #312e81, #1e1b4b)). Example: linear-gradient(135deg, #7c3aed, #6d28d9). |
voice.pillTextColor | string | '#ffffff' | Text color on the pill launcher |
voice.pillDotColor | string | '#4ade80' | Online-status dot color on the pill |
Hero / Idle Screen
| Property | Type | Default | Description |
|---|---|---|---|
agentName | string | — | Root only (duplicate of General Configuration). Idle hero name and avatar initial; default Sasha when unset/empty. Mobile pre-chat landing: agentName if set, else header.title. |
voice.avatarGradient1 | string | '#6d56f5' | Avatar gradient start color |
voice.avatarGradient2 | string | '#a78bfa' | Avatar gradient end color |
voice.headerAvatarImageUrl | string | '' | Optional https (or http) URL for the idle voice header circle (desktop and mobile pre-call). When missing or invalid, the UI shows the first letter of the agent name inside the circle (e.g. "S" for Sasha). Same key may be set at the top level as headerAvatarImageUrl. Legacy: voice.avatarImageUrl (and snake_case header_avatar_image_url / avatar_image_url on voice or root) are also accepted. Invalid or non-http(s) URLs are ignored. |
voice.onlineDotColor | string | '#22c55e' | Online dot next to avatar |
voice.heroGradient1 | string | '#2a2550' | Hero gradient start: desktop voice panel (idle and in-call), widget footer, and .voice-interface.active use linear-gradient(160deg, heroGradient1, heroGradient2). Mobile in-call chrome uses voice.pillGradient instead (see Pill Launcher). |
voice.heroGradient2 | string | '#1a1a2e' | Hero gradient end; pairs with heroGradient1 for desktop/active surfaces above—not for the mobile minimized bar (use pillGradient). |
voice.agentNameColor | string | '#f0eff8' | Agent name text color |
voice.agentRoleColor | string | 'rgba(255,255,255,0.35)' | Agent role text color |
voice.agentRole | string | 'AI Voice Assistant' | Agent role label |
voice.headlineColor | string | '#ffffff' | Hero headline text color |
voice.headline | string | 'Hi there 👋' | Hero headline text |
voice.sublineColor | string | 'rgba(255,255,255,0.45)' | Hero subline text color |
voice.subline | string | 'Ask me anything...' | Hero subline text (supports HTML) |
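The naming rules for the idle hero can be condensed into a short sketch: only root agentName is read, Sasha is the default, the avatar falls back to the first letter, and the mobile pre-chat landing uses agentName if set and header.title otherwise. resolveAgentDisplay is a hypothetical helper for illustration only.

```javascript
// Illustration only: derive the displayed names from a widget config object.
function resolveAgentDisplay(config = {}) {
  // voice.agentName is deliberately not read: it is stripped during merge.
  const agentName = typeof config.agentName === 'string' ? config.agentName : '';
  const heroName = agentName || 'Sasha'; // documented default when omitted or empty
  const headerTitle = (config.header && config.header.title) || 'Chat Assistant';
  return {
    heroName,
    avatarInitial: Array.from(heroName)[0], // first letter, used when no header image is set
    mobileLandingName: agentName || headerTitle,
  };
}
```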
Primary & Secondary Buttons
| Property | Type | Default | Description |
|---|---|---|---|
voice.primaryBtnGradient1 | string | '#6d56f5' | "Start Voice Call" button gradient start; footer “TalkToPC” link uses the same resolved accent (after startCallButtonColor fallback when gradients are omitted). Mobile user bubbles and send accent use this palette with primaryBtnGradient2 / sendButtonColor. |
voice.primaryBtnGradient2 | string | '#9d8df8' | "Start Voice Call" button gradient end; pairs with sendButtonColor for mobile message/send styling. |
voice.startCallButtonTextColor | string | '#FFFFFF' | Primary button text color |
voice.startCallButtonText | string | 'Start Voice Call' | Primary button label |
voice.sendMessageText | string | 'Send a Message' | Secondary button label |
voice.secondaryBtnBg | string | 'rgba(255,255,255,0.05)' | Secondary button background |
voice.secondaryBtnBorder | string | 'rgba(255,255,255,0.09)' | Secondary button border color |
voice.secondaryBtnTextColor | string | 'rgba(255,255,255,0.6)' | Secondary button text color |
Active Call View
| Property | Type | Default | Description |
|---|---|---|---|
voice.waveformBarColor | string | '#7C3AED' | Waveform bar color during call |
voice.speakerButtonColor | string | '#FFFFFF' | Speaker button color |
Quick Theme Example
Apply a complete light theme:
const widget = new TTPAgentSDK.TTPChatWidget({
agentId: 'your-agent-id',
appId: 'your-app-id',
panel: {
backgroundColor: '#ffffff',
border: '1px solid rgba(0,0,0,0.06)'
},
voice: {
pillGradient: 'linear-gradient(135deg, #7c3aed, #6d28d9)',
pillTextColor: '#ffffff',
heroGradient1: '#ede9fe',
heroGradient2: '#f5f3ff',
agentNameColor: '#1e1b4b',
headlineColor: '#1e1b4b',
sublineColor: '#6b7280',
primaryBtnGradient1: '#7c3aed',
primaryBtnGradient2: '#a78bfa',
secondaryBtnBg: '#f5f3ff',
secondaryBtnBorder: 'rgba(124,58,237,0.15)',
secondaryBtnTextColor: '#6d28d9'
}
});
RTL Hebrew theme example:
const widget = new TTPAgentSDK.TTPChatWidget({
agentId: 'your-agent-id',
appId: 'your-app-id',
direction: 'rtl',
header: { title: 'עוזרת חכמה', onlineIndicatorText: 'מחוברת' },
panel: { backgroundColor: '#0f172a' },
voice: {
pillGradient: 'linear-gradient(135deg, #1e3a5f, #1e40af, #0f172a)',
avatarGradient1: '#3b82f6',
avatarGradient2: '#1d4ed8',
heroGradient1: '#1a2744',
heroGradient2: '#0f172a',
primaryBtnGradient1: '#3b82f6',
primaryBtnGradient2: '#1d4ed8',
startCallButtonText: 'התחל שיחה קולית',
sendMessageText: 'שלח הודעה',
agentRole: 'עוזרת קולית חכמה',
headline: 'היי, מה שלומך? 👋',
subline: 'שאל/י אותי הכל — אני עונה מיידית בקול או בטקסט.'
}
});
Behavior
| Property | Type | Default | Description |
|---|---|---|---|
| behavior.mode | string | 'unified' | 'unified' (both), 'voice-only', 'text-only' |
| behavior.autoOpen | boolean | false | Auto-open widget on page load |
| behavior.startOpen | boolean | false | Start with widget open |
| behavior.hidden | boolean | false | Hide the widget completely |
| behavior.mobileVoiceUI | boolean | auto | Force mobile-style voice call UI (bottom bar + overlay) or desktop hero. Omit for auto: native iOS/Android, or viewport ≤768px with touch/coarse pointer (helps in-app browsers with desktop User-Agent). Root-level mobileVoiceUI overrides this if both are set. |
| behavior.autoConnect | boolean | false | Auto-connect on widget open |
| behavior.showWelcomeMessage | boolean | true | Show welcome message |
| behavior.welcomeMessage | string | 'Hello! How can I help...' | Welcome message text |
| behavior.enableVoiceMode | boolean | true | Enable voice mode option (in unified mode) |
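The precedence between root mobileVoiceUI, behavior.mobileVoiceUI, and the auto mode can be sketched as follows. The env object and both function names are hypothetical; in particular, the User-Agent test used for "native iOS/Android" is an assumption, since this page does not say how the widget detects it.

```javascript
// Illustration only: decide whether the mobile-style voice UI should be used.
function resolveMobileVoiceUI(config = {}, env) {
  if (typeof config.mobileVoiceUI === 'boolean') return config.mobileVoiceUI; // root-level wins
  const behavior = config.behavior || {};
  if (typeof behavior.mobileVoiceUI === 'boolean') return behavior.mobileVoiceUI;
  // Auto mode: native mobile, or a narrow viewport with touch / coarse pointer.
  const smallTouchViewport = env.viewportWidth <= 768 && env.hasTouchOrCoarsePointer;
  return Boolean(env.isNativeMobile || smallTouchViewport);
}

// Browser-only helper (assumed detection logic) that builds the env object.
function readBrowserEnv() {
  return {
    isNativeMobile: /iPhone|iPad|iPod|Android/i.test(navigator.userAgent),
    viewportWidth: window.innerWidth,
    hasTouchOrCoarsePointer:
      navigator.maxTouchPoints > 0 || window.matchMedia('(pointer: coarse)').matches,
  };
}
```

In a page you would call resolveMobileVoiceUI(config, readBrowserEnv()); keeping env as a parameter makes the decision easy to unit test.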
Animation, Prompt Bubble & Tooltips
| Property | Type | Default | Description |
|---|---|---|---|
| animation.enableHover | boolean | true | Enable hover animations |
| animation.enablePulse | boolean | true | Enable pulse animations |
| animation.enableSlide | boolean | true | Enable slide animations |
| animation.duration | number | 0.3 | Animation duration (seconds) |
| promptAnimation.enabled | boolean | false | Show an animated “Try me!” prompt bubble next to the launcher pill. Must be explicitly set to true. |
| promptAnimation.text | string | 'Try me!' | Prompt bubble label text |
| promptAnimation.backgroundColor | string | purple gradient | Prompt bubble background (CSS color or gradient) |
| promptAnimation.textColor | string | '#ffffff' | Prompt bubble text color |
| promptAnimation.animationType | string | 'bounce' | 'bounce', 'pulse', 'float', or 'none' |
| promptAnimation.showShimmer | boolean | true | Shimmer effect on the prompt bubble |
| promptAnimation.showPulseRings | boolean | true | Pulse rings around the launcher while the prompt is visible |
| promptAnimation.hideAfterClick | boolean | true | Hide the prompt after the user opens the widget |
| promptAnimation.hideAfterSeconds | number or null | null | Auto-hide after N seconds. null = never auto-hide. |
| promptAnimation.position | string | 'top' | Prompt placement relative to the launcher: 'top', 'left', or 'right' |
| tooltips.newChat | string | Auto | New chat button tooltip |
| tooltips.back | string | Auto | Back button tooltip |
| tooltips.close | string | Auto | Close button tooltip |
| tooltips.mute | string | Auto | Mute button tooltip |
| tooltips.speaker | string | Auto | Speaker button tooltip |
| tooltips.endCall | string | Auto | End call button tooltip |
Mobile pre-chat overlay & legacy landing keys (Unified Mode)
On desktop, unified mode opens the voice idle hero inside the panel; there is no in-panel “Voice / Text” mode card screen. Back navigation and call end return to that hero. On mobile, the full-screen pre-chat sheet (ttpMobileLanding) still offers call vs text; the following options apply there.
During an active call, the bottom minimized bar stays visible first; tapping the transcript row opens a conversation sheet (type while voice is active). Close the sheet to return to the bar. Mute, speaker, and end call remain on the bar when the sheet is closed.
On mobile, Back (header or text bar) closes the panel and opens the Call vs Chat landing overlay directly, the same sheet as tapping the FAB. If the user declines the server-driven disclaimer during voice setup, the widget also returns to that Call vs Chat overlay (and disconnects) instead of leaving the in-panel “Start call” card. After accepting the disclaimer, the minimized voice bar (waveform / mic / end call) stays the default view; the expandable conversation sheet with the text field does not open automatically.
Other landing.* keys (e.g. mode card colors, in-panel title) remain in config merges for backward compatibility but are not rendered on desktop.
| Property | Type | Default | Description |
|---|---|---|---|
| landing.voiceCardTitle | string | null | Fallback label for the mobile “call” action when landing.callButtonText is not set |
| landing.textCardTitle | string | null | Fallback label for the mobile “chat” action when landing.chatButtonText is not set |
| landing.callButtonText | string | null | Mobile overlay primary button text (falls back to voiceCardTitle / translation) |
| landing.chatButtonText | string | null | Mobile overlay secondary button text (falls back to textCardTitle / translation) |
| landing.statusText | string | null | Mobile overlay status line; when unset, uses online translation plus landing.subtitle (e.g. “Ready to help”) |
| landing.subtitle | string | - | Shown after the online dot in the mobile overlay status when landing.statusText is not set |
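The fallbacks between the landing.* keys can be traced with the sketch below. resolveLandingText is a hypothetical helper; the translation keys (call, chat, online) and the single space used to join the online label with landing.subtitle are assumptions made for illustration, not documented values.

```javascript
// Illustration only: resolve the mobile overlay labels from landing config and translations.
// `t` stands in for the widget's translated strings; its key names are assumed.
function resolveLandingText(landing = {}, t = {}) {
  const callLabel = landing.callButtonText || landing.voiceCardTitle || t.call;
  const chatLabel = landing.chatButtonText || landing.textCardTitle || t.chat;
  // Assumption: the online label and subtitle are joined with a single space.
  const status = landing.statusText || `${t.online || ''} ${landing.subtitle || ''}`.trim();
  return { callLabel, chatLabel, status };
}
```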
Advanced Configuration
| Property | Type | Default | Description |
|---|---|---|---|
| websocketUrl | string | Optional | Custom WebSocket base URL (defaults to wss://speech.talktopc.com/ws/conv). If not provided, URL is constructed from agentId/appId. |
| demo | boolean | true | Enable demo mode |
| panel.backdropFilter | string | null | CSS backdrop filter (e.g., 'blur(10px)') |
| panel.border | string | '1px solid rgba(0,0,0,0.1)' | Panel border style |
| button.shadowColor | string | 'rgba(0,0,0,0.15)' | Button shadow color |
| icon.emoji | string | '🎤' | Emoji when icon.type = 'emoji' |
| icon.text | string | 'AI' | Text when icon.type = 'text' |
| accessibility.ariaLabel | string | 'Chat Assistant' | ARIA label for the widget |
| accessibility.ariaDescription | string | 'Click to open chat assistant' | ARIA description |
| accessibility.keyboardNavigation | boolean | true | Enable keyboard navigation |
| onConversationStart | function | - | Called when a voice conversation starts (after connect + recording begins). |
| onConversationEnd | function | - | Called when a voice conversation ends (including disclaimer decline before recording). |
| onBargeIn | function | - | Called when the user interrupts the agent (barge-in). |
| onAudioStartPlaying | function | - | Called when agent audio playback starts. |
| onAudioStoppedPlaying | function | - | Called when agent audio playback stops. |
| onSubtitleDisplay | function | - | Called when a subtitle/transcript line is displayed during a call. |
| onVoiceCallButtonClick | function | - | Called when the user taps “Start Voice Call”. Return false to cancel the call start. |
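A short sketch of how the lifecycle callbacks might be wired, including cancelling a call start by returning false from onVoiceCallButtonClick. `buildCallbackConfig`, `hasConsent`, and the `log` array are hypothetical app-side helpers, not SDK APIs; only the callback names come from the table.

```javascript
// Illustrative config fragment; spread it into your widget config alongside agentId/appId.
// `hasConsent` and `log` are hypothetical app-side helpers.
function buildCallbackConfig({ hasConsent, log = [] }) {
  return {
    onVoiceCallButtonClick: () => {
      if (!hasConsent()) {
        log.push('blocked');
        return false; // returning false cancels the call start
      }
      // no return value: the call start proceeds
    },
    onConversationStart: () => { log.push('start'); },
    onConversationEnd: () => { log.push('end'); },
    onBargeIn: () => { log.push('barge-in'); },
  };
}
```

The returned object could then be merged into the widget options, for example `{ agentId, appId, ...buildCallbackConfig({ hasConsent }) }`.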
Visual Assistant (visualAssistant)
When enabled, the SDK registers browser-side tools the agent can call to read the page, highlight elements, scroll, navigate, fill forms, click elements, and capture screenshots. Tools are always registered on the client; the backend agent config decides which are available.
visualAssistant: {
enabled: true, // Required to activate visual assistant
allowHighlight: true, // highlight_element
allowScroll: true, // scroll_to_element
allowNavigate: true, // navigate_to
allowFillForm: true, // fill_form
allowClick: true // click_element
}
May be set at the widget root or inside agentSettingsOverride.visualAssistant. See examples/test-client-tools.html for a live demo.
✅ Configuration Verified:
All configuration options listed above have been verified against the source code and are fully supported. The widget has 100+ customization options across 15+ categories:
- General (incl. visualAssistant, whatsapp, inputFormat)
- Positioning (incl. draggable, draggablePersist)
- Icon & Button
- Panel & Header (incl. online indicator, footer colors)
- Messages & Text Chat
- Voice Interface & output format
- Voice Theming & Pill Launcher
- Mobile landing / legacy landing
- Prompt bubble (promptAnimation)
- Tooltips & Animations
- Behavior (incl. mobileVoiceUI)
- Accessibility
- Event callbacks (voice lifecycle)
- Advanced & legacy keys
- Visual Assistant
Experiment with all options interactively in the live demo.
📖 Tip: All configuration options support the spread operator (...), so you can pass additional custom properties that will be merged with defaults.
Use Cases
💼 Customer Support
Add 24/7 AI-powered support to your website
🛒 E-commerce
Help customers find products and answer questions
📚 Documentation
Provide interactive help for your docs
🎓 Education
Create AI tutors and learning assistants
Vanilla JavaScript Guide
Use the SDK in any JavaScript application without frameworks.
Complete Example
import { VoiceSDK } from 'ttp-agent-sdk';
class VoiceAssistant {
constructor() {
this.sdk = null;
this.isConnected = false;
this.isRecording = false;
}
async initialize(agentId, overrides = {}) {
this.sdk = new VoiceSDK({
agentId: agentId,
appId: 'your_app_id',
agentSettingsOverride: overrides
});
// Setup event listeners
this.setupEventListeners();
// Connect
await this.sdk.connect();
}
setupEventListeners() {
this.sdk.on('connected', () => {
this.isConnected = true;
this.updateUI('connected');
});
this.sdk.on('disconnected', () => {
this.isConnected = false;
this.isRecording = false;
this.updateUI('disconnected');
});
this.sdk.on('recordingStarted', () => {
this.isRecording = true;
this.updateUI('recording');
});
this.sdk.on('recordingStopped', () => {
this.isRecording = false;
this.updateUI('connected');
});
this.sdk.on('message', (msg) => {
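// Wire messages carry their kind in `t` (as in Quick Start); `type` is kept as a fallback
if ((msg.t || msg.type) === 'agent_response') {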
this.displayMessage('agent', msg.agent_response);
}
});
this.sdk.on('error', (error) => {
this.handleError(error);
});
}
async toggleRecording() {
if (!this.isConnected) return;
if (this.isRecording) {
await this.sdk.stopRecording();
} else {
await this.sdk.startRecording();
}
}
disconnect() {
if (this.sdk) {
this.sdk.disconnect();
this.sdk = null;
}
}
updateUI(state) {
// Update your UI based on state
}
displayMessage(role, text) {
// Display message in your UI
}
handleError(error) {
// Handle errors
}
}
// Usage
const assistant = new VoiceAssistant();
await assistant.initialize('agent_123', {
language: 'es',
temperature: 0.9
});
React Integration
Use the SDK in React applications with hooks and components.
Using Hooks
import React, { useState, useEffect, useRef } from 'react';
import { VoiceSDK } from 'ttp-agent-sdk';
function VoiceChat() {
const [status, setStatus] = useState('disconnected');
const [isRecording, setIsRecording] = useState(false);
const [messages, setMessages] = useState([]);
const sdkRef = useRef(null);
// Initialize SDK
useEffect(() => {
async function initSDK() {
const sdk = new VoiceSDK({
agentId: 'agent_123',
appId: 'your_app_id',
agentSettingsOverride: {
language: 'es',
temperature: 0.9
}
});
// Event listeners
sdk.on('connected', () => setStatus('connected'));
sdk.on('disconnected', () => {
setStatus('disconnected');
setIsRecording(false);
});
sdk.on('recordingStarted', () => setIsRecording(true));
sdk.on('recordingStopped', () => setIsRecording(false));
sdk.on('message', (msg) => {
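// Wire messages carry their kind in `t` (as in Quick Start); `type` is kept as a fallback
if ((msg.t || msg.type) === 'agent_response') {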
setMessages(prev => [
...prev,
{ role: 'agent', text: msg.agent_response }
]);
}
});
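// Store the instance before connecting so the cleanup function can
// disconnect it even if the component unmounts while connect() is pending
sdkRef.current = sdk;
await sdk.connect();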
}
initSDK();
// Cleanup
return () => {
if (sdkRef.current) {
sdkRef.current.disconnect();
}
};
}, []);
const toggleRecording = async () => {
if (sdkRef.current) {
await sdkRef.current.toggleRecording();
}
};
return (
<div>
<div>Status: {status}</div>
<button onClick={toggleRecording} disabled={status !== 'connected'}>
{isRecording ? 'Stop' : 'Start'} Recording
</button>
<div>
{messages.map((msg, i) => (
<div key={i}>{msg.role}: {msg.text}</div>
))}
</div>
</div>
);
}
export default VoiceChat;
Fly-to-Cart Animation
When users tap "Add to Cart" in the e-commerce widget, the product card flies down into the cart icon with a tornado funnel effect. Uses html2canvas to capture the card as an image, avoiding stacking/overflow clipping in iframes or containers with overflow:hidden. Built-in for TTPEcommerceWidget. For custom React product widgets, use the useFlyToCart hook.
React Hook (useFlyToCart)
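import { useRef } from 'react';
import { useFlyToCart } from 'ttp-agent-sdk';
// Inside your product component:
const cartIconRef = useRef(null); // attach to the cart icon (the fly target)
const cardRef = useRef(null); // attach to the product card element via ref={cardRef}
const { triggerFly, isAnimating } = useFlyToCart(cartIconRef);
const handleAddToCart = (product) => {
// Fly the card toward the cart icon; the callback performs the actual add
triggerFly(cardRef.current, product, () => {
addToCartAPI(product); // your own cart call
});
};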
Vanilla JS (FlyToCart)
import { FlyToCart } from 'ttp-agent-sdk';
const flyToCart = new FlyToCart({
getCartTarget: () => document.querySelector('.cart-icon'),
onCartBump: () => { /* optional */ },
});
flyToCart.triggerFly(cardElement, product, () => {
addToCartAPI(product);
});
E-commerce: Adding to cart (partner API & front-end cart)
Cart logic is owned by your integration (partner APIs, mock stores, or a Shopify storefront). The widget renders products, sends user intent over the WebSocket, and updates the cart summary when it receives the right messages. The SDK does not keep its own authoritative cart array: totals and counts should reflect what the backend (or the browser cart, then the backend) considers true.
Who does the "add"?
| Model | Where the line item is added | How the widget learns the new totals |
|---|---|---|
| Server-side partner cart (mock store, custom API, headless stack) | Conversation backend calls the partner API after product_selected or after the model runs the internal tool add_to_cart. | Backend sends t: "cart_updated" with cartTotal, cartItemCount, and optional currency. For a visible "added" toast, include action: "added" and a product object (id, name, price). |
| Shopify Online Store (widget embedded on the store theme) | The browser must call Shopify's Ajax Cart API (POST /cart/add.js) so the add uses the visitor's store session cookie. Server-only Storefront API carts are a different session and will not match the theme cart. | The conversation backend sends t: "add_to_store_cart" with variantId and quantity. The ecommerce flavor runs the Ajax calls, refreshes the cart bar from GET /cart.js, and sends t: "cart_add_result" back so the backend can continue the turn. The backend may still send cart_updated afterward if you want one canonical sync message. |
| Custom client tool | Your backend issues a client_tool_call with a tool name you defined (for example addToCart). Your page handles it via registerToolHandler on the widget or AgentSDK. | The SDK replies with client_tool_result (or client_tool_error). The cart bar still expects cart_updated unless your handler updates UI itself. |
Flow 1 — User taps Add / Update on a product card
- The widget stops playback (so the agent does not talk over the action) and sends t: "product_selected" on the voice WebSocket. Payload includes productId, productName, price, quantity (absolute units or weight), and sellBy (quantity or weight).
- Your backend decides what happens next:
  - Update the partner cart via API, then push cart_updated to the session, or
  - For Shopify on-domain, send add_to_store_cart and let the widget perform /cart/add.js, then consume cart_add_result on the server, or
  - Hybrid: tool or internal step on the server plus a follow-up cart_updated.
- EcommerceManager.handleCartUpdated updates the bottom cart bar when cartTotal and cartItemCount are present. Optional action: "added" drives the short confirmation UX.
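As a rough illustration of the server-side partner cart path, the sketch below turns a product_selected message into a cart_updated message. It assumes a hypothetical in-memory `cart` in place of a real partner API, and it assumes cartItemCount is the sum of line quantities; a real integration would read totals back from the partner. `applyProductSelected` is not an SDK function.

```javascript
// Backend-side sketch. `cart` is a hypothetical stand-in: { currency, lines: Map }.
function applyProductSelected(cart, msg) {
  // quantity is absolute, so replace the line instead of incrementing it
  cart.lines.set(msg.productId, {
    name: msg.productName,
    price: msg.price,
    quantity: msg.quantity,
  });
  let cartTotal = 0;
  let cartItemCount = 0;
  for (const line of cart.lines.values()) {
    cartTotal += line.price * line.quantity;
    cartItemCount += line.quantity; // assumption: count is the sum of quantities
  }
  return {
    t: 'cart_updated',
    cartTotal: Math.round(cartTotal * 100) / 100, // round to two decimals
    cartItemCount,
    currency: cart.currency,
    action: 'added', // drives the short confirmation toast
    product: { id: msg.productId, name: msg.productName, price: msg.price },
  };
}
```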
Flow 2 — User asks in voice (or the agent adds without a card click)
For TTP's Java conversation backend, e-commerce uses internal tools such as search_products, add_to_cart, and get_cart (names match what the LLM sees; they are not the same as optional SDK client_tool_call names unless you wire them).
- The model calls add_to_cart with productId and quantity.
- The server runs the partner integration:
  - Partner API cart: The service calls the partner, receives success and line items, then typically sends cart_updated to the widget with totals and item count so the UI matches the server.
  - Shopify theme cart: The service sends add_to_store_cart to the widget instead of mutating cart only on the server; the widget performs the Ajax add and returns cart_add_result. Reading the cart may use get_store_cart from the server, which the widget answers with cart_state_result after GET /cart.js.
- The model receives a text summary of the tool result and can speak to the user.
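The Shopify theme cart path can be sketched roughly as follows. This is not the ecommerce flavor's code: `handleAddToStoreCart` is a hypothetical name, `fetchImpl` is injected so the logic can be exercised without a store, and the result field names other than `t` and `verbalAck` are assumptions. The two endpoints are Shopify's Ajax Cart API (`POST /cart/add.js`, `GET /cart.js`).

```javascript
// Browser-side sketch of handling add_to_store_cart on a Shopify theme page.
async function handleAddToStoreCart(msg, fetchImpl = fetch) {
  const ack = msg.verbalAck ? { verbalAck: true } : {}; // echoed only when requested
  const addRes = await fetchImpl('/cart/add.js', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ items: [{ id: msg.variantId, quantity: msg.quantity }] }),
  });
  if (!addRes.ok) {
    return { t: 'cart_add_result', success: false, ...ack };
  }
  // Refresh totals from the theme cart so the cart bar matches the store session
  const cart = await (await fetchImpl('/cart.js')).json();
  return {
    t: 'cart_add_result',
    success: true,
    cartItemCount: cart.item_count,
    cartTotal: cart.total_price / 100, // the Ajax API reports prices in minor units
    currency: cart.currency,
    ...ack,
  };
}
```

Because the requests run in the visitor's browser, they carry the store session cookie, which is the reason the add has to happen on the client in this model.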
Message cheat sheet (widget ↔ backend)
| Direction | t / type | Role |
|---|---|---|
| Widget → backend | product_selected | User confirmed quantity and tapped Add/Update on a card. |
| Backend → widget | show_products / show_items | Render product cards (search or browse results). |
| Backend → widget | cart_updated | Sync cartTotal, cartItemCount, optional currency, optional action + product for toast. |
| Backend → widget | add_to_store_cart | Shopify: widget runs Ajax add for variantId / quantity. Optional verbalAck: true when the add was triggered by the user tapping Add (not an agent tool); echoed on cart_add_result so the server can prompt a brief spoken acknowledgment. |
| Widget → backend | cart_add_result | Outcome of Ajax add (success, counts, totals, currency). Includes verbalAck when the originating add_to_store_cart or hook add_to_site_cart requested it (UI add). |
| Backend → widget | get_store_cart | Shopify: widget fetches /cart.js and replies with cart contents. |
| Widget → backend | cart_state_result | Async cart snapshot after get_store_cart (implementation-specific). |
| Backend → widget | client_tool_call | Optional: run custom logic in the page; reply with client_tool_result. |
Custom client tools: Register handlers with registerToolHandler on the widget or AgentSDK so the toolName in each client_tool_call matches what your backend defines.
Widget Flavors
The TTP SDK supports domain-specific "flavors" that customize the widget experience for different verticals. Each flavor provides specialized UI components, backend tools, and message handling.
Available Flavors
| Flavor Type | Partner IDs | Backend Tools | Key Features |
|---|---|---|---|
| ecommerce | mock-store, shopify | search_products, add_to_cart, get_cart | Product cards, cart bar, fly-to-cart animation |
| hotels | mock-hotel | search_rooms, select_room, add_extra, get_booking, show_media | Room cards, booking bar, gallery |
| pharma | mock-pharm | search_medications, add_to_prescription, get_prescription | Medication cards (with Rx/OTC badges), prescription summary bar |
| restaurants | mock-restaurant | search_menu, add_to_order, get_order, show_media | Menu item cards (with allergen/dietary tags), order summary bar, gallery |
| tours | mock-tour | search_tours, book_tour, get_tour_booking, show_media | Tour item cards, booking summary bar, gallery |
E-commerce WebSocket messages: The widget listens for t: "show_items" and t: "show_products" (same handler; backend tools often send show_products with products, title, layout). Use TTPEcommerceWidget (or TTPChatWidget with flavor.type: 'ecommerce') so these handlers are registered. For end-to-end add-to-cart flows (partner API vs Shopify Ajax vs client tools), see E-commerce cart flows.
Desktop voice strip (flavor.callView: 'minimized'): When flavor.callView is 'minimized', a fixed bottom voice surface is used on viewports wider than 768px. It requires only a flavor object carrying callView: 'minimized' (e.g. flavor: { callView: 'minimized' }); a flavor type and partner are optional.
The strip is a rounded floating dock with layered glass styling (rim light, ambient shadow, soft indigo outer glow), an inset transcript “well,” squircle control tiles, and a blurred LIVE capsule, with the same layout and controls as before. It is centered with a capped width (up to about 720px with horizontal inset) and bottom spacing plus env(safe-area-inset-bottom). It uses the same tokens as the floating pill (voice.pillGradient, pillTextColor, pillDotColor, endCallButtonColor). Mute, pause, speaker, keyboard (text inject), and end-call align with the in-widget voice UI.
The in-panel desktop “active call” UI stays hidden; the floating panel collapses if it was open, and the launcher pill is hidden for the duration of the call. The “Powered by” footer (footer.show, footer.brand) is rendered inside the strip (below the call controls) for the duration of the call, not as a separate floating pill at the widget corner. Viewports 768px and below skip the strip and use the mobile minimized bar flow instead.
A single transcript line shows either the assistant (streaming) or the user (speech-to-text interim/final), never both at once; user text uses a warm highlight color and no You: prefix. When the widget language is Hebrew or Arabic (base he / ar) or direction is 'rtl', chrome uses RTL. Transcript lines use explicit dir="rtl" when the string contains Hebrew or Arabic letters (even if the widget language is English), so neutral punctuation follows the sentence; Latin-only lines use dir="ltr". Mic/pause/speaker/end layout is unchanged.
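Two of the rules above, the viewport threshold and the per-line transcript direction, can be expressed as a small sketch. The function names are hypothetical and the Unicode ranges used to detect Hebrew and Arabic letters are an assumption; the widget's own detection may differ.

```javascript
const STRIP_MIN_WIDTH = 768; // viewports at or below this width use the mobile bar

// True when the desktop voice strip would apply for this config and viewport
function usesDesktopStrip(config, viewportWidth) {
  return config?.flavor?.callView === 'minimized' && viewportWidth > STRIP_MIN_WIDTH;
}

// Hebrew block (U+0590 to U+05FF) and Arabic block (U+0600 to U+06FF); assumed ranges
const RTL_LETTERS = /[\u0590-\u05FF\u0600-\u06FF]/;

// Direction for a single transcript line, independent of the widget language
function transcriptDir(text) {
  return RTL_LETTERS.test(text) ? 'rtl' : 'ltr';
}
```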
Configuration
Set the flavor when creating the widget:
const widget = new TTPChatWidget({
agentId: 'agent_...',
appId: 'app_...',
flavor: {
type: 'pharma', // 'ecommerce' | 'hotels' | 'pharma' | 'restaurants' | 'tours'
partnerId: 'mock-pharm' // partner-specific data source
}
});
Pharma Flavor
The pharmacy flavor provides a medication search and prescription management experience.
// Pharma configuration
flavor: {
type: 'pharma',
partnerId: 'mock-pharm'
}
// Backend tools injected automatically:
// - search_medications: Search the medication catalog
// - add_to_prescription: Add a medication to the prescription
// - get_prescription: View current prescription contents
// Frontend message types:
// - show_items: Displays medication cards
// - prescription_updated: Updates the prescription summary bar
The pharma flavor does not include gallery support (show_media).
Restaurants Flavor
The restaurants flavor provides a menu browsing, ordering, and photo gallery experience.
// Restaurant configuration
flavor: {
type: 'restaurants',
partnerId: 'mock-restaurant'
}
// Backend tools injected automatically:
// - search_menu: Search the restaurant menu
// - add_to_order: Add a menu item to the order
// - get_order: View current order contents
// - show_media: Display restaurant photo gallery
// Frontend message types:
// - show_items: Displays menu item cards
// - order_updated: Updates the order summary bar
// - show_media / dismiss_media: Gallery with dish photos, ambiance, etc.
Menu item cards display allergen warnings and dietary tags automatically.
Tours Flavor
The tours flavor provides a tour browsing, booking, and photo gallery experience.
// Tour configuration
flavor: {
type: 'tours',
partnerId: 'mock-tour'
}
// Backend tools injected automatically:
// - search_tours: Search available tours and activities by query
// - book_tour: Book a tour or activity
// - get_tour_booking: View current booking contents
// - show_media: Display tour photo gallery
// Frontend message types:
// - show_items: Displays tour item cards
// - cart_updated: Updates the booking summary bar
// - show_media / dismiss_media: Gallery with tour photos, highlights, etc.
Tour cards display activity tags and pricing. Reuses the same UI infrastructure as the restaurants flavor.
Client-Script Tools
Client-script tools let you attach backend-authored JavaScript to a client tool (tool_type: 'client') that runs in the visitor's browser when the LLM calls the tool — no host-page registerToolHandler() code required. The script is configured in the dashboard (tool form → "Agent scripts"), stored with the tool, and delivered to the widget automatically at session start. Available since SDK v2.45.3.
Use registerToolHandler() when the host page owns the logic and ships its own JS. Use a client-script tool when the agent owner wants browser-side behavior (DOM reads, page API calls, UI nudges) configurable from the dashboard without touching the embedding site.
How it works
- Session init: the backend pushes a partner_bundle message with the reserved partner id __client_tools__, carrying every scripted tool attached to the agent (plus auto-run library scripts). The bundle is pushed on both the voice and text-chat channels, works with or without a widget flavor, and is re-pushed on reconnect (a new bundle replaces the previous one).
- Compile on load: each entry compiles into a strict-mode async function immediately when the bundle lands. A syntax error poisons only that entry (logged at load time); the rest of the bundle still works.
- Invocation: when the LLM calls the tool, the SDK runs the compiled script and returns its result to the backend.
- Auto-run: library scripts flagged auto_run execute exactly once when the bundle arrives (e.g. to set up page listeners).
Authoring styles
Two styles compile — the SDK detects the shape and picks the right wrapping:
// 1. Raw statement body (typical in the dashboard UI; `ctx` is in scope)
alert("hi");
return { ok: true, clicked: true };
// 2. Function expression
async (ctx) => {
const res = await ctx.fetch('/api/something');
return { ok: true, data: await res.json() };
}
Both run inside an async function, so await is always allowed. A script that returns nothing resolves to { ok: true }.
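To make the two authoring styles concrete, here is a minimal sketch of how a script string could be wrapped into a strict-mode async function, with per-entry error isolation at bundle load. It is not the SDK's compiler: `compileScript` and `compileBundle` are hypothetical names, and the regular expression used to detect a function expression is a simplifying assumption.

```javascript
// The AsyncFunction constructor lets a string body use `await`
const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;

// Assumed heuristic: source starting like a function or arrow expression
const FUNCTION_SHAPE = /^(async\s+)?(function\b|\([^)]*\)\s*=>|[A-Za-z_$][\w$]*\s*=>)/;

function compileScript(code) {
  const src = code.trim().replace(/;\s*$/, '');
  const body = FUNCTION_SHAPE.test(src)
    ? `"use strict"; return (${src})(ctx);` // style 2: call the expression with ctx
    : `"use strict"; ${src};`;              // style 1: raw statements, ctx in scope
  const run = new AsyncFunction('ctx', body); // throws SyntaxError if it does not parse
  return async (ctx) => {
    const result = await run(ctx);
    return result === undefined ? { ok: true } : result; // nothing returned -> { ok: true }
  };
}

// Compile every entry; a syntax error disables only the entry that caused it
function compileBundle(adapters) {
  const compiled = {};
  for (const [name, entry] of Object.entries(adapters)) {
    try {
      compiled[name] = compileScript(entry.code_js);
    } catch (err) {
      console.error(`compile failed for ${name}:`, err.message);
    }
  }
  return compiled;
}
```

Compiling at load time rather than at first call means a broken script surfaces in the console as soon as the bundle arrives, before the LLM ever invokes it.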
The ctx object
| Property | Description |
|---|---|
| ctx.args | Parameters from the LLM tool call. |
| ctx.host | window.location.hostname. |
| ctx.fetch | Bound window.fetch. |
| ctx.log(...) | console.log prefixed with [adapter:<tool>]. |
| ctx.runAdapter(action, args) | Invoke another script in the same bundle (max nesting depth 16). |
| ctx.emit(msg) | Send an unsolicited JSON message back to the backend over the active channel. |
Client-tool scripts run with a generic context — the ecommerce-only helpers (ctx.platform, ctx.normalizeCart, ctx.refreshUI) are not available; those exist only in ecommerce partner-adapter bundles.
Pre / main / post steps
A scripted tool may carry up to three steps, executed in order: pre → main → post.
- Pre is best-effort: if it throws, the error is logged and main still runs.
- Main produces the tool result. A thrown error becomes an ok: false result.
- Post runs only when main succeeded; its ctx.args._mainResult holds main's return value.
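The step ordering can be sketched as a small runner. This is an illustration under assumptions, not the SDK's code: `runScriptedTool` is a hypothetical name, "main succeeded" is taken to mean "main did not throw", the `error` field on a failed result is assumed, and a failing post step is simply logged.

```javascript
// steps: { pre?, main, post? }, each an async function taking ctx
async function runScriptedTool(steps, ctx) {
  if (steps.pre) {
    try {
      await steps.pre(ctx);
    } catch (err) {
      console.warn('pre step failed (ignored):', err.message); // best-effort
    }
  }
  let mainResult;
  try {
    mainResult = await steps.main(ctx);
  } catch (err) {
    // A thrown error becomes an ok: false result; post is skipped
    return { ok: false, error: String(err.message || err) };
  }
  if (steps.post) {
    // Post sees main's return value under ctx.args._mainResult
    const postCtx = { ...ctx, args: { ...ctx.args, _mainResult: mainResult } };
    try {
      await steps.post(postCtx);
    } catch (err) {
      console.warn('post step failed:', err.message); // assumed handling
    }
  }
  return mainResult;
}
```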
Sync vs async results
Configured per tool in the dashboard ("Wait for result"):
| Mode | Behavior |
|---|---|
| Wait ON (sync) | The backend awaits the script result up to timeoutMs (1000–15000 ms, default 8000) and feeds it to the LLM as the tool result. |
| Wait OFF (async) | The LLM immediately receives the configured pending message; the real result is injected later: SPOKEN_EVENT (spoken immediately), SPOKEN_DEFERRED (next natural turn, default), or SILENT_CONTEXT (silent context update). |
Limits
| Limit | Value |
|---|---|
| Per-step code size | 32 KB |
| Result size | 512 KB serialized (oversize → ok: false, error: "RESULT_TOO_LARGE") |
| Sync timeout | 1000–15000 ms (default 8000) |
| Pending message | 500 chars |
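A sketch of how these limits might be checked locally before shipping a tool. Whether the platform clamps or rejects out-of-range timeouts is not stated here, so clamping is an assumption, as is treating 512 KB as 512 × 1024 bytes of UTF-8 JSON. `clampTimeout` and `enforceResultSize` are hypothetical helpers.

```javascript
const LIMITS = {
  minTimeoutMs: 1000,
  maxTimeoutMs: 15000,
  defaultTimeoutMs: 8000,
  maxResultBytes: 512 * 1024, // assumption: KB means 1024 bytes
};

// Keep a sync timeout inside the documented range, falling back to the default
function clampTimeout(timeoutMs) {
  if (!Number.isFinite(timeoutMs)) return LIMITS.defaultTimeoutMs;
  return Math.min(LIMITS.maxTimeoutMs, Math.max(LIMITS.minTimeoutMs, timeoutMs));
}

// Replace an oversized result with the documented error shape
function enforceResultSize(result) {
  const bytes = new TextEncoder().encode(JSON.stringify(result)).length;
  return bytes > LIMITS.maxResultBytes
    ? { ok: false, error: 'RESULT_TOO_LARGE' }
    : result;
}
```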
Wire protocol (reference)
Script invocations arrive on two wire forms, both handled by the SDK on both channels:
// Bundle (session init / reconnect)
{ "t": "partner_bundle", "partner_id": "__client_tools__",
"adapters": { "my_tool": { "code_js": "...", "pre_code_js": "...", "post_code_js": "..." } },
"autoRun": { "my_script": { "code_js": "..." } } }
// Form A — dedicated envelope (async path / chain steps)
→ { "t": "run_partner_script", "requestId": "...", "partnerId": "__client_tools__",
"action": "my_tool", "args": { ... } }
← { "t": "run_partner_script_result", "requestId": "...", "ok": true, "result": { ... } }
// Form B — client_tool_call piggyback (backend sync-await path)
→ { "t": "client_tool_call", "toolCallId": "...", "toolName": "run_partner_script",
"parameters": { "action": "my_tool", "args": { ... }, "partner_id": "__client_tools__" } }
← { "t": "client_tool_result", "toolCallId": "...", "result": { ... } }
Routing rule: partner_id === '__client_tools__' → the flavor-independent ClientScriptManager; any other partner id → the ecommerce flavor's partner-adapter path. The two bundles coexist.
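The two wire forms and the routing rule can be summarized in a short dispatcher sketch. `handleScriptMessage` and the injected `runAction(action, args)` are hypothetical; the field names on error replies (`error`) are assumptions, while the success envelopes follow the shapes shown above.

```javascript
const CLIENT_TOOLS_PARTNER = '__client_tools__';

// Returns the reply to send back, or null when the message is not a client-script call
async function handleScriptMessage(msg, runAction) {
  // Form A: dedicated envelope
  if (msg.t === 'run_partner_script' && msg.partnerId === CLIENT_TOOLS_PARTNER) {
    try {
      const result = await runAction(msg.action, msg.args);
      return { t: 'run_partner_script_result', requestId: msg.requestId, ok: true, result };
    } catch (err) {
      return { t: 'run_partner_script_result', requestId: msg.requestId, ok: false, error: String(err.message || err) };
    }
  }
  // Form B: client_tool_call piggyback
  if (msg.t === 'client_tool_call' && msg.toolName === 'run_partner_script'
      && msg.parameters?.partner_id === CLIENT_TOOLS_PARTNER) {
    try {
      const result = await runAction(msg.parameters.action, msg.parameters.args);
      return { t: 'client_tool_result', toolCallId: msg.toolCallId, result };
    } catch (err) {
      return { t: 'client_tool_error', toolCallId: msg.toolCallId, error: String(err.message || err) };
    }
  }
  return null; // any other partner id goes to the ecommerce partner-adapter path
}
```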
Troubleshooting
| Symptom | Cause / fix |
|---|---|
| compile failed ... SyntaxError at bundle load | Script doesn't parse. On SDK < 2.45.1, raw statement bodies (e.g. alert("stop");) failed even when valid; check the SDK version banner in the console and hard-refresh if stale. |
| NO_HANDLER for run_partner_script | SDK < 2.45.3 didn't route the client_tool_call piggyback form without an ecommerce flavor. Fixed in 2.45.3. |
| adapter_not_in_bundle | Tool not attached to the agent, or the bundle hasn't arrived yet. Check the session-start partner_bundle log. |
| Sync result is a timeout | Script exceeded timeoutMs. Raise it (max 15 s) or switch to async delivery. |
For the full internals reference see CLIENT_SCRIPT_TOOLS_GUIDE.md in the repository root.
Chain Tools
A chain tool (tool_type: 'chain') runs a graph of steps — server tools, client tools, agent switches, and library scripts — as a single LLM tool call. Chains are edited visually in the dashboard (Chain Editor: nodes are steps, edges are data-flow dependencies) and executed by the backend; the widget participates only when a step targets the browser.
Execution model
- Steps with no dependencies run first; steps at the same level run in parallel; levels run sequentially.
- A step's input_from lists the upstream steps whose outputs feed it. Dependent steps receive the original LLM arguments shallow-merged with each upstream result (a non-object result lands under the upstream step's id).
- Fail-fast: the first failing step aborts the chain with { "success": false, "error": ..., "step": ... }.
- The LLM receives the terminal step's result (multiple terminals merge keyed by step id).
- Inside a chain, every step is awaited — a referenced client tool's own "wait for result = off" setting is ignored.
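The level ordering and input merging described in this list can be sketched as follows. The backend's actual scheduler is not shown in this document; `planLevels` and `buildStepInput` are hypothetical helpers, the step shape `{ id, input_from }` is inferred from the text, and treating arrays as non-object results is an assumption.

```javascript
// Group steps into levels: each level depends only on steps in earlier levels
function planLevels(steps) {
  const done = new Set();
  const levels = [];
  let remaining = [...steps];
  while (remaining.length > 0) {
    const ready = remaining.filter((s) => (s.input_from || []).every((id) => done.has(id)));
    if (ready.length === 0) throw new Error('cycle or unknown dependency in chain');
    levels.push(ready.map((s) => s.id)); // steps in one level may run in parallel
    ready.forEach((s) => done.add(s.id));
    remaining = remaining.filter((s) => !done.has(s.id));
  }
  return levels;
}

// Original LLM arguments shallow-merged with each upstream result
function buildStepInput(llmArgs, step, results) {
  const input = { ...llmArgs };
  for (const id of step.input_from || []) {
    const upstream = results[id];
    if (upstream !== null && typeof upstream === 'object' && !Array.isArray(upstream)) {
      Object.assign(input, upstream);
    } else {
      input[id] = upstream; // non-object result lands under the upstream step's id
    }
  }
  return input;
}
```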
Step types and the widget
| Step type | Runs where | Widget involvement |
|---|---|---|
| server_tool | Backend (webhook) | None. |
| switch_agent | Backend | None. |
| client_tool (plain) | Browser | Arrives as a normal client_tool_call → your registerToolHandler() handler. |
| client_tool (scripted) | Browser | Dispatched as a __client_tools__ bundle action keyed by tool name. |
| client_script (library) | Browser | Dispatched as a __client_tools__ bundle action keyed by script:{scriptId}. |
Scripts and scripted client tools referenced by a chain are resolved into the session bundle automatically — they do not need to be attached to the agent.
Constraints
- 1–30 steps per chain; no cycles; chains cannot reference other chain tools (no recursion).
- Deleting a tool or library script that a chain references is blocked in the dashboard (the delete dialog lists the referencing chains).
VoiceSDK Class
Core class for voice interaction functionality. Server-driven disclaimer gating (compliance copy before STT/greeting) is described in detail under Server-driven disclaimer (voice) and applies when using protocol v2.
Constructor
new VoiceSDK(config)
Configuration Object
| Property | Type | Required | Description |
|---|---|---|---|
| agentId | string | Yes | The AI agent identifier to connect to |
| appId | string | Yes | Your application identifier |
| websocketUrl | string | No | Optional custom WebSocket base URL (defaults to wss://speech.talktopc.com/ws/conv) |
| agentSettingsOverride | object | No | Custom agent configuration |
| voice | string | No | Voice preset name (default: 'default') |
| language | string | No | Language code (default: 'en') |
| sampleRate | number | No | Input audio sample rate: 8000, 16000, 22050, 24000, 44100, or 48000 Hz (default: 16000) |
| channels | number | No | Input audio channels (default: 1, mono only) |
| bitDepth | number | No | Input audio bit depth: 8, 16, or 24 bits (default: 16) |
| outputContainer | string | No | Output container format: 'raw' or 'wav' (default: 'raw') |
| outputEncoding | string | No | Output audio encoding: 'pcm', 'pcmu' (μ-law), or 'pcma' (A-law) (default: 'pcm') |
| outputSampleRate | number | No | Output audio sample rate: 8000, 16000, 22050, 24000, 44100, or 48000 Hz (default: 24000) |
| outputChannels | number | No | Output audio channels (default: 1, mono only) |
| outputBitDepth | number | No | Output audio bit depth: 8, 16, or 24 bits (default: 16) |
| outputFrameDurationMs | number | No | Frame duration for raw PCM streaming in milliseconds (default: 600) |
| protocolVersion | number | No | Protocol version: 1 (legacy) or 2 (format negotiation) (default: 2) |
| autoReconnect | boolean | No | Auto-reconnect on disconnect (default: true) |
Audio Format Configuration (v2 Protocol)
The SDK v2 supports format negotiation with the backend. You can specify both input and output audio formats:
- Input Encodings: PCM, PCMU (μ-law), PCMA (A-law)
- Input Sample Rates: 8000, 16000, 22050, 24000, 44100, 48000 Hz
- Input Bit Depths: 8, 16, 24 bits
- Output Containers: 'raw' (no header) or 'wav' (with header)
- Output Encodings: PCM, PCMU (μ-law), PCMA (A-law)
- Output Sample Rates: 8000, 16000, 22050, 24000, 44100, 48000 Hz
- Output Bit Depths: 8, 16, 24 bits
Example: Custom Audio Format
const voiceSDK = new VoiceSDK({
agentId: 'agent_123',
appId: 'your_app_id',
// Input format (what we send to server)
sampleRate: 16000,
channels: 1,
bitDepth: 16,
// Output format (what we want from server)
outputContainer: 'raw', // 'raw' or 'wav'
outputEncoding: 'pcm', // 'pcm', 'pcmu', 'pcma'
outputSampleRate: 24000, // Default; typical TTS/server output
outputChannels: 1,
outputBitDepth: 16,
outputFrameDurationMs: 600, // Frame duration for streaming
// Protocol version
protocolVersion: 2 // Use v2 protocol for format negotiation
});
// Listen for format negotiation
voiceSDK.on('formatNegotiated', (format) => {
console.log('Format negotiated:', format);
// format contains: { container, encoding, sampleRate, channels, bitDepth }
});
Methods
connect()
Connect to the voice agent.
Returns: Promise<boolean>
await voiceSDK.connect();
disconnect()
Disconnect from the voice agent.
Returns: void
voiceSDK.disconnect();
startRecording()
Start capturing and streaming audio.
Disclaimer (v2): If the server required a disclaimer and disclaimersPending is still true, this throws an Error with error.code === 'DISCLAIMER_PENDING'. Call sendDisclaimerAck(true) first. See Server-driven disclaimer.
Returns: Promise<boolean>
await voiceSDK.startRecording();
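One way to handle the DISCLAIMER_PENDING case is to catch it, ask the user, acknowledge, and retry. The sketch below uses only startRecording() and sendDisclaimerAck() from this reference; `startRecordingWithDisclaimer` and `askUser` (an async function resolving to true or false, for example from a consent dialog) are hypothetical app-side names.

```javascript
// Start recording, resolving a pending disclaimer through a user prompt first
async function startRecordingWithDisclaimer(sdk, askUser) {
  try {
    return await sdk.startRecording();
  } catch (err) {
    // Anything other than a pending disclaimer is a real failure
    if (err.code !== 'DISCLAIMER_PENDING') throw err;
  }
  const accepted = await askUser(); // hypothetical UI prompt
  sdk.sendDisclaimerAck(accepted);
  // After a decline the call does not continue, so recording is not started
  return accepted ? sdk.startRecording() : false;
}
```

An alternative is to listen for the disclaimersRequired event and prompt before the first startRecording() call; the catch-and-retry form above simply keeps the logic in one place.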
sendDisclaimerAck(accepted) (v2, voice)
Sends disclaimer_ack after the user accepts or declines server-shown disclaimer text. Uses disclaimersHash and conversationId from the last hello_ack.
Parameters: accepted (boolean) — true to continue the call, false to decline.
Returns: void (logs and clears disclaimersPending on successful send).
voiceSDK.sendDisclaimerAck(true);
stopRecording()
Stop capturing audio.
Returns: Promise<boolean>
await voiceSDK.stopRecording();
toggleRecording()
Toggle recording state (start/stop).
Returns: Promise<boolean>
await voiceSDK.toggleRecording();
pauseCall()
Pause the active call. Stops sending audio, clears the playback queue, and notifies the server to close STT/TTS connections. The WebSocket stays open and conversation history is preserved. A configurable timeout (default 5 minutes) will automatically end the call if not resumed.
Returns: void
voiceSDK.pauseCall();
// or via AgentSDK:
agentSDK.pauseCall();
resumeCall()
Resume a paused call. Restarts audio recording and notifies the server to re-create STT/TTS connections. The conversation continues from where it left off with full history.
Returns: void
voiceSDK.resumeCall();
// or via AgentSDK:
agentSDK.resumeCall();
isPaused (property)
Boolean indicating whether the call is currently paused.
Type: boolean
if (voiceSDK.isPaused) {
console.log('Call is paused');
}
// or via AgentSDK:
if (agentSDK.isPaused) {
console.log('Call is paused');
}
getStatus()
Get current connection and recording status.
Returns: Object
const status = voiceSDK.getStatus();
// Returns: {
// version: '2.0.0',
// isConnected: boolean,
// isRecording: boolean,
// isPlaying: boolean,
// outputFormat: object, // Negotiated output format (v2)
// audioPlayer: object, // AudioPlayer status
// audioRecorder: object // AudioRecorder status
// }
validateInputFormat(format)
v2 only: Validate input audio format configuration.
Parameters:
- format (object) - Format object with encoding, sampleRate, bitDepth, channels
Returns: string|null - Error message if invalid, null if valid
const error = voiceSDK.validateInputFormat({
encoding: 'pcm',
sampleRate: 16000,
bitDepth: 16,
channels: 1
});
if (error) {
console.error('Invalid format:', error);
}
validateOutputFormat(format)
v2 only: Validate output audio format configuration.
Parameters:
- format (object) - Format object with container, encoding, sampleRate, bitDepth, channels
Returns: string|null - Error message if invalid, null if valid
const error = voiceSDK.validateOutputFormat({
container: 'raw',
encoding: 'pcm',
sampleRate: 24000,
bitDepth: 16,
channels: 1
});
if (error) {
console.error('Invalid format:', error);
}
updateConfig(newConfig)
Update SDK configuration dynamically.
Parameters:
- newConfig (object) - Partial configuration object to merge with existing config
Returns: void
voiceSDK.updateConfig({
outputSampleRate: 48000,
outputEncoding: 'pcmu'
});
reconnect()
Manually reconnect to the agent.
Returns: Promise<boolean>
await voiceSDK.reconnect();
stopAudioPlayback()
Immediately stop audio playback (for barge-in).
Returns: void
voiceSDK.stopAudioPlayback();
on(event, callback)
Register an event listener.
Parameters:
- event (string) - Event name
- callback (function) - Event handler
voiceSDK.on('connected', () => {
console.log('Connected!');
});
destroy()
Cleanup all resources and disconnect.
Returns: void
voiceSDK.destroy();
Events Reference
| Event | Parameters | Description |
|---|---|---|
| connected | - | Emitted when successfully connected |
| disconnected | event | Emitted when disconnected (includes reason) |
| error | error | Emitted on errors |
| recordingStarted | - | Emitted when recording starts |
| recordingStopped | - | Emitted when recording stops |
| message | message | Emitted for all WebSocket messages |
| playbackStarted | - | Emitted when audio playback starts |
| playbackStopped | - | Emitted when audio playback stops |
| playbackError | error | Emitted on audio playback errors |
| bargeIn | message | Emitted when user interrupts agent |
| stopPlaying | message | Emitted when server requests to stop audio |
| formatNegotiated | format | v2 only: Emitted when audio format is negotiated with server. Format object contains: container, encoding, sampleRate, channels, bitDepth |
| greetingStarted | - | Emitted when greeting audio starts |
| domainError | error | Emitted when domain is not whitelisted |
| disclaimersRequired | payload | v2 voice: Server requires disclaimer acknowledgement. Payload: texts, disclaimersHash, disclaimerTimeoutMs, conversationId. Call sendDisclaimerAck after user decision. |
| disclaimerRejected | { code, message } | v2 voice: Terminal disclaimer failure from server (DISCLAIMER_DECLINED, DISCLAIMER_TIMEOUT, DISCLAIMER_HASH_MISMATCH). Not used for DISCLAIMER_PENDING (that uses error). |
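The two disclaimer events can be wired to a consent prompt roughly as follows. This is a sketch under assumptions: `askUser` is a hypothetical async function that shows `payload.texts` and resolves to true or false, and sending `sendDisclaimerAck(false)` on decline is assumed from "after user decision" (the Quick Start only shows `sendDisclaimerAck(true)` followed by `startRecording()`).

```javascript
// Sketch: connect the disclaimer events to a consent prompt.
// `askUser` is hypothetical; it should display the texts and resolve to a boolean.
function wireDisclaimerFlow(sdk, askUser) {
  sdk.on('disclaimersRequired', async (payload) => {
    const accepted = await askUser(payload.texts);
    sdk.sendDisclaimerAck(accepted); // assumption: false signals a decline
    if (accepted) sdk.startRecording();
  });
  sdk.on('disclaimerRejected', ({ code, message }) => {
    // Terminal failure: the conversation cannot continue.
    console.warn(`Disclaimer rejected (${code}): ${message}`);
  });
}
```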
Configuration Options
Agent Settings Override
Complete reference for all overridable settings:
Core Settings
| Setting | Type | Range/Values | Description |
|---|---|---|---|
| prompt | string | Any text | System prompt/instructions for the agent |
| temperature | number | 0.0 - 2.0 | LLM creativity level |
| maxTokens | number | 1 - 4096 | Maximum tokens per response |
| model | string | Model names | ⚠️ NOT SUPPORTED - LLM model selection requires infrastructure changes |
| language | string | ISO codes | Response language (e.g., 'en', 'es', 'fr') |
Voice Settings
| Setting | Type | Range/Values | Description |
|---|---|---|---|
| voiceId | string | Voice IDs | Specific voice identifier |
| voiceSpeed | number | 0.5 - 2.0 | Voice speed multiplier |
Behavior Settings
| Setting | Type | Range/Values | Description |
|---|---|---|---|
| firstMessage | string | Any text | Initial greeting message |
| disableInterruptions | boolean | true/false | Prevent user from interrupting agent |
| autoDetectLanguage | boolean | true/false | Automatically detect user's language |
| candidateLanguages | array | Language codes | List of candidate languages for auto-detection (e.g., ['en', 'es', 'fr']) |
| maxCallDuration | number | Seconds | Maximum session duration |
Advanced Settings
| Setting | Type | Range/Values | Description |
|---|---|---|---|
| toolIds | array | Array of numbers | Array of custom tool IDs to enable for this agent (e.g., [123, 456, 789]) |
| internalToolIds | array | Array of strings | Array of internal tool IDs to enable for this agent (e.g., ['calendar', 'weather', 'email']) |
| timezone | string | TZ names | User timezone (e.g., 'America/New_York') |
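The numeric ranges in the settings tables can be checked on the client before an override is sent. The following is a minimal sketch; `validateOverrides` is a hypothetical helper, and the ranges are copied from the tables above.

```javascript
// Hypothetical helper: check override values against the documented ranges.
const NUMERIC_RANGES = {
  temperature: [0.0, 2.0],
  maxTokens: [1, 4096],
  voiceSpeed: [0.5, 2.0],
};

function validateOverrides(overrides) {
  const problems = [];
  for (const [key, [min, max]] of Object.entries(NUMERIC_RANGES)) {
    const value = overrides[key];
    if (value === undefined) continue; // setting not overridden
    if (!Number.isFinite(value) || value < min || value > max) {
      problems.push(`${key} must be a number between ${min} and ${max}`);
    }
  }
  if ('model' in overrides) {
    problems.push('model is not supported as an override');
  }
  return problems;
}
```

An empty result means the numeric overrides are within range; other settings (strings, arrays, booleans) are not checked by this sketch.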
Text-to-Speech API
PUBLIC REST API
Generate high-quality voice audio from text using our public REST API endpoint.
Endpoint
POST https://backend.talktopc.com/api/public/agents/tts/generate
Authentication
Include your API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | The text to convert to speech |
| voiceId | string | No | Voice identifier (default: agent's configured voice) |
| voiceSpeed | number | No | Voice speed multiplier: 0.5 - 2.0 (default: 1.0) |
| language | string | No | Language code (e.g., 'en', 'es', 'fr') |
| agentId | string | No | Agent ID to use voice settings from |
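As a sketch of how the parameter rules above translate into a request body, the hypothetical `buildTtsPayload` helper below enforces the required `text` field and the documented `voiceSpeed` range, and omits optional fields that were not supplied.

```javascript
// Hypothetical helper: build the JSON body for the TTS endpoint from the documented parameters.
function buildTtsPayload({ text, voiceId, voiceSpeed, language, agentId }) {
  if (typeof text !== 'string' || text.trim() === '') {
    throw new Error('text is required');
  }
  if (voiceSpeed !== undefined &&
      (!Number.isFinite(voiceSpeed) || voiceSpeed < 0.5 || voiceSpeed > 2.0)) {
    throw new RangeError('voiceSpeed must be between 0.5 and 2.0');
  }
  // JSON.stringify drops undefined values, so unset optional fields are omitted.
  return JSON.stringify({ text, voiceId, voiceSpeed, language, agentId });
}
```

The returned string can be passed as the `body` of a POST request to the endpoint.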
Example Requests
curl -X POST https://backend.talktopc.com/api/public/agents/tts/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "Hello! Welcome to our service.",
"voiceId": "nova",
"voiceSpeed": 1.2,
"language": "en"
}' \
--output speech.mp3
// Node.js 18+ (ES module): runs server-side so the API key stays out of the browser
import { writeFile } from 'node:fs/promises';
const response = await fetch('https://backend.talktopc.com/api/public/agents/tts/generate', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${process.env.TTP_API_KEY}`
},
body: JSON.stringify({
text: 'Hello! Welcome to our service.',
voiceId: 'nova',
voiceSpeed: 1.2,
language: 'en'
})
});
if (!response.ok) {
throw new Error(`TTS request failed: HTTP ${response.status}`);
}
// The response body is the MP3 audio; save it to a file
const audioBuffer = Buffer.from(await response.arrayBuffer());
await writeFile('speech.mp3', audioBuffer);
console.log('Audio saved to speech.mp3');
import requests
import os
url = "https://backend.talktopc.com/api/public/agents/tts/generate"
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {os.environ['TTP_API_KEY']}"
}
data = {
"text": "Hello! Welcome to our service.",
"voiceId": "nova",
"voiceSpeed": 1.2,
"language": "en"
}
response = requests.post(url, json=data, headers=headers)
if response.status_code == 200:
# Save audio to file
with open("speech.mp3", "wb") as f:
f.write(response.content)
print("Audio saved to speech.mp3")
else:
print(f"Error: {response.status_code}")
<?php
$url = "https://backend.talktopc.com/api/public/agents/tts/generate";
$apiKey = getenv('TTP_API_KEY');
$data = [
'text' => 'Hello! Welcome to our service.',
'voiceId' => 'nova',
'voiceSpeed' => 1.2,
'language' => 'en'
];
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, [
'Content-Type: application/json',
'Authorization: Bearer ' . $apiKey
]);
$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
if ($httpCode === 200) {
file_put_contents('speech.mp3', $response);
echo "Audio saved to speech.mp3";
} else {
echo "Error: HTTP $httpCode";
}
?>
import java.net.http.*;
import java.net.URI;
import java.nio.file.*;
public class TTSExample {
public static void main(String[] args) throws Exception {
String url = "https://backend.talktopc.com/api/public/agents/tts/generate";
String apiKey = System.getenv("TTP_API_KEY");
String json = """
{
"text": "Hello! Welcome to our service.",
"voiceId": "nova",
"voiceSpeed": 1.2,
"language": "en"
}
""";
HttpClient client = HttpClient.newHttpClient();
HttpRequest request = HttpRequest.newBuilder()
.uri(URI.create(url))
.header("Content-Type", "application/json")
.header("Authorization", "Bearer " + apiKey)
.POST(HttpRequest.BodyPublishers.ofString(json))
.build();
HttpResponse<byte[]> response = client.send(
request,
HttpResponse.BodyHandlers.ofByteArray()
);
if (response.statusCode() == 200) {
Files.write(Paths.get("speech.mp3"), response.body());
System.out.println("Audio saved to speech.mp3");
} else {
System.out.println("Error: " + response.statusCode());
}
}
}
Response
The endpoint returns audio data directly with the following headers:
| Header | Value | Description |
|---|---|---|
| Content-Type | audio/mpeg | Audio format (MP3) |
| Content-Length | number | Size of audio file in bytes |
Voice Speed Examples
| Speed | Effect | Use Case |
|---|---|---|
| 0.5 | 50% slower (half speed) | Educational content, accessibility |
| 0.75 | 25% slower | Clear pronunciation, language learning |
| 1.0 | Normal speed (default) | Standard conversation |
| 1.2 | 20% faster | Quick updates, notifications |
| 1.5 | 50% faster | Rapid information delivery |
| 2.0 | 2x speed (double speed) | Maximum speed, time-saving |
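To get a feel for what a speed value does to clip length, the sketch below assumes the multiplier scales the playback rate linearly, so duration is the normal-speed duration divided by the speed. That linear relationship is an assumption for illustration, and `estimateDurationMs` is a hypothetical helper.

```javascript
// Hypothetical helper: estimate clip duration at a given voiceSpeed,
// assuming duration scales as 1 / speed.
function estimateDurationMs(baseDurationMs, voiceSpeed) {
  if (!(voiceSpeed >= 0.5 && voiceSpeed <= 2.0)) {
    throw new RangeError('voiceSpeed must be between 0.5 and 2.0');
  }
  return Math.round(baseDurationMs / voiceSpeed);
}

console.log(estimateDurationMs(10000, 0.5)); // 20000
console.log(estimateDurationMs(10000, 2.0)); // 5000
```

Under this assumption, a 10-second clip takes about 20 seconds at 0.5 and about 5 seconds at 2.0.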
Backend Implementation Example
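// Your backend endpoint (Express; Node.js 18+ provides the global fetch)
const express = require('express');
const app = express();
app.use(express.json()); // Parse JSON bodies so req.body is populated
app.post('/api/generate-speech', async (req, res) => {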
const { text, voiceSpeed = 1.0 } = req.body;
// Call TTP TTS API
const ttpResponse = await fetch('https://backend.talktopc.com/api/public/agents/tts/generate', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${process.env.TTP_API_KEY}` // 🔒 Secret!
},
body: JSON.stringify({
text: text,
voiceSpeed: voiceSpeed,
voiceId: 'nova',
language: 'en'
})
});
if (!ttpResponse.ok) {
return res.status(ttpResponse.status).json({
error: 'TTS generation failed'
});
}
// Forward audio to client
const audioBuffer = await ttpResponse.arrayBuffer();
res.set('Content-Type', 'audio/mpeg');
res.send(Buffer.from(audioBuffer));
});
Error Responses
| Status Code | Description |
|---|---|
| 400 | Bad Request - Invalid parameters |
| 401 | Unauthorized - Invalid or missing API key |
| 429 | Too Many Requests - Rate limit exceeded |
| 500 | Internal Server Error - TTS generation failed |
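A backend that calls this endpoint usually needs to decide which failures are worth retrying. The sketch below maps the status codes from the table to a simple decision; treating 429 and 500 as retryable is an assumption (a common convention, not something this API documents), and `classifyTtsStatus` is a hypothetical helper.

```javascript
// Hypothetical helper: decide how to react to a TTS API status code.
// Assumption: 429 and 500 are worth retrying; 400 and 401 are not.
function classifyTtsStatus(status) {
  if (status >= 200 && status < 300) {
    return { ok: true, retry: false, reason: 'success' };
  }
  switch (status) {
    case 400: return { ok: false, retry: false, reason: 'invalid parameters' };
    case 401: return { ok: false, retry: false, reason: 'invalid or missing API key' };
    case 429: return { ok: false, retry: true, reason: 'rate limit exceeded' };
    case 500: return { ok: false, retry: true, reason: 'TTS generation failed' };
    default: return { ok: false, retry: false, reason: `unexpected status ${status}` };
  }
}
```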
- Never expose your API key in frontend JavaScript
- Always call this endpoint from your backend server
- Implement rate limiting on your backend
- Validate and sanitize text input to prevent abuse
Use Cases
📢 Announcements
Generate audio announcements for notifications
📚 Content Creation
Convert articles or books to audio format
♿ Accessibility
Provide audio alternatives for text content
🎓 E-Learning
Create voice-overs for educational materials
Java SDK
Server-side Java SDK for text-to-speech conversion. Perfect for backend applications, phone systems, and server-to-server integrations.
- Backend TTS: Generate speech on your server without exposing API keys
- Phone Systems: Integrate with Twilio, Telnyx, or custom VoIP systems
- Server-to-Server: Automated voice generation for notifications, alerts, or content
- Audio Format Control: Request specific formats (PCMU, PCMA, PCM) for phone systems
Installation
Maven
<repositories>
<repository>
<id>github</id>
<url>https://maven.pkg.github.com/TTP-GO/java-sdk</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>com.talktopc</groupId>
<artifactId>ttp-agent-sdk-java</artifactId>
<version>1.0.5</version>
</dependency>
</dependencies>
You'll need to authenticate with GitHub Packages. Add credentials to your ~/.m2/settings.xml:
<settings>
<servers>
<server>
<id>github</id>
<username>YOUR_GITHUB_USERNAME</username>
<password>YOUR_GITHUB_TOKEN</password>
</server>
</servers>
</settings>
Gradle
repositories {
maven {
url = uri("https://maven.pkg.github.com/TTP-GO/java-sdk")
credentials {
username = project.findProperty("gpr.user") ?: System.getenv("USERNAME")
password = project.findProperty("gpr.key") ?: System.getenv("TOKEN")
}
}
}
dependencies {
implementation 'com.talktopc:ttp-agent-sdk-java:1.0.5'
}
Quick Start
1. Initialize SDK
import com.talktopc.sdk.VoiceSDK;
// Get API key from environment variable
String apiKey = System.getenv("TALKTOPC_API_KEY");
VoiceSDK sdk = VoiceSDK.builder()
.apiKey(apiKey)
.baseUrl("https://speech.talktopc.com") // Optional
.build();
2. Simple TTS (Blocking)
// Generate complete audio file
byte[] audio = sdk.textToSpeech("Hello world", "mamre");
// Save to file
Files.write(Paths.get("output.wav"), audio);
// Or send to phone system
phoneSystem.playAudio(audio);
3. Streaming TTS (Real-time)
// Stream audio chunks as they're generated
sdk.textToSpeechStream(
"Hello world, this is a longer text that will be streamed",
"mamre",
audioChunk -> {
// Receive chunks in real-time
phoneSystem.playAudio(audioChunk);
}
);
API Reference
VoiceSDK
Main SDK entry point for text-to-speech operations.
Builder Methods
| Method | Type | Description |
|---|---|---|
| apiKey(String) | String | Your TalkToPC API key (required) |
| baseUrl(String) | String | API base URL (default: https://speech.talktopc.com) |
| connectTimeout(int) | int | Connection timeout in milliseconds (default: 30000) |
| readTimeout(int) | int | Read timeout in milliseconds (default: 60000) |
Methods
| Method | Description |
|---|---|
| textToSpeech(String text, String voiceId) | Simple TTS (blocking) - returns complete audio as byte array |
| textToSpeech(String text, String voiceId, double speed) | TTS with speed control (0.1 - 3.0) |
| textToSpeech(TTSRequest request) | TTS with full configuration (format, speed, etc.) |
| synthesize(TTSRequest request) | Get full response with metadata (sample rate, duration, credits) |
| textToSpeechStream(String text, String voiceId, Consumer<byte[]> chunkHandler) | Streaming TTS - chunks delivered to handler as they're generated |
| textToSpeechStream(TTSRequest request, Consumer<byte[]> chunkHandler, Consumer<StreamMetadata> onComplete, Consumer<Throwable> onError) | Streaming TTS with completion and error callbacks |
TTSRequest Builder
Configure TTS requests with audio format options.
Basic Configuration
TTSRequest request = TTSRequest.builder()
.text("Hello world") // Required
.voiceId("mamre") // Required
.speed(1.0) // Optional (0.1 - 3.0)
.build();
Audio Format Configuration
TTSRequest request = TTSRequest.builder()
.text("Hello world")
.voiceId("mamre")
.outputContainer("raw") // "raw" or "wav"
.outputEncoding("pcm") // "pcm", "pcmu", "pcma"
.outputSampleRate(24000) // Hz (8000, 16000, 22050, 24000, 44100, 48000)
.outputBitDepth(16) // bits (8, 16, 24)
.outputChannels(1) // 1 (mono) or 2 (stereo)
.outputFrameDurationMs(600) // ms per frame (for streaming)
.build();
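The difference between the "raw" and "wav" containers is essentially a header: "raw" is bare sample data, while "wav" prefixes it with a RIFF header describing the format. As an illustration (in JavaScript for Node.js, like the rest of this page's client code), the sketch below wraps raw PCM in a standard 44-byte WAV header. It assumes the raw bytes are little-endian linear PCM, which is an assumption about the service output; `wrapPcmInWav` is a hypothetical helper.

```javascript
// Sketch: prepend a canonical 44-byte WAV (RIFF) header to raw linear PCM bytes.
// Assumes `pcm` is a Node.js Buffer of little-endian PCM samples.
function wrapPcmInWav(pcm, { sampleRate = 24000, bitDepth = 16, channels = 1 } = {}) {
  const blockAlign = channels * (bitDepth / 8); // bytes per sample frame
  const byteRate = sampleRate * blockAlign;     // bytes per second
  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + pcm.length, 4);     // file size minus the first 8 bytes
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);                 // fmt chunk size for PCM
  header.writeUInt16LE(1, 20);                  // audio format 1 = linear PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitDepth, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(pcm.length, 40);         // size of the sample data
  return Buffer.concat([header, pcm]);
}
```

Requesting the "wav" container directly avoids this step; the sketch is mainly useful when raw chunks were collected from a stream and need to be saved as a playable file.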
Preset Methods
| Method | Format | Use Case |
|---|---|---|
| phoneSystem() | PCMU @ 8kHz, 20ms frames | Phone systems (Twilio, Telnyx, VoIP) |
| highQuality() | WAV @ 44.1kHz | High-quality audio files |
| standardQuality() | PCM @ 22.05kHz | Standard quality audio |
TTSResponse
Response object containing audio and metadata.
| Method | Return Type | Description |
|---|---|---|
| getAudio() | byte[] | Audio data |
| getSampleRate() | int | Sample rate in Hz |
| getDurationMs() | long | Playback duration in milliseconds |
| getAudioSizeBytes() | long | Audio size in bytes |
| getCreditsUsed() | double | Credits consumed |
| getConversationId() | String | Unique conversation ID |
Examples
Basic Usage
VoiceSDK sdk = VoiceSDK.builder()
.apiKey(System.getenv("TALKTOPC_API_KEY"))
.build();
// Simple TTS
byte[] audio = sdk.textToSpeech("Welcome to TalkToPC", "mamre");
System.out.println("Generated " + audio.length + " bytes of audio");
// Save to file
Files.write(Paths.get("output.wav"), audio);
With Speed Control
// Faster speech (1.5x speed)
byte[] fastAudio = sdk.textToSpeech("Quick message", "mamre", 1.5);
// Slower speech (0.8x speed)
byte[] slowAudio = sdk.textToSpeech("Slow and clear", "mamre", 0.8);
Streaming with Metadata
sdk.textToSpeechStream(
TTSRequest.builder()
.text("Streaming example with full configuration")
.voiceId("mamre")
.speed(1.0)
.build(),
audioChunk -> {
// Handle each audio chunk
System.out.println("Received chunk: " + audioChunk.length + " bytes");
phoneSystem.playAudio(audioChunk);
},
metadata -> {
// Handle completion
System.out.println("Stream completed:");
System.out.println(" Total chunks: " + metadata.getTotalChunks());
System.out.println(" Total bytes: " + metadata.getTotalBytes());
System.out.println(" Duration: " + metadata.getDurationMs() + " ms");
System.out.println(" Credits: " + metadata.getCreditsUsed());
},
error -> {
// Handle errors
System.err.println("Stream error: " + error.getMessage());
}
);
Phone System Integration
Perfect for Twilio, Telnyx, or custom VoIP systems.
Standard Phone System (PCMU @ 8kHz)
// Using convenient phoneSystem() preset
TTSRequest request = TTSRequest.builder()
.text("Hello, thank you for calling. How can I help you today?")
.voiceId("en-US-female")
.phoneSystem() // ✅ PCMU @ 8kHz, 20ms frames
.build();
sdk.textToSpeechStream(
request,
audioChunk -> {
// audioChunk is PCMU @ 8kHz, 20ms frames (160 bytes)
// Ready to send directly to phone connection
phoneConnection.sendAudio(audioChunk);
}
);
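The 160-byte figure in the comment above follows from simple arithmetic: samples per frame times bytes per sample times channels. The sketch below shows the calculation in JavaScript (the arithmetic is language-independent); `frameSizeBytes` is a hypothetical helper, not an SDK function.

```javascript
// Hypothetical helper: expected size in bytes of one audio frame.
function frameSizeBytes({ encoding, sampleRate, frameDurationMs, bitDepth = 16, channels = 1 }) {
  // G.711 mu-law (pcmu) and A-law (pcma) store one byte per sample;
  // linear PCM uses bitDepth / 8 bytes per sample.
  const bytesPerSample = encoding === 'pcm' ? bitDepth / 8 : 1;
  const samplesPerFrame = Math.round((sampleRate * frameDurationMs) / 1000);
  return samplesPerFrame * bytesPerSample * channels;
}

// PCMU @ 8 kHz with 20 ms frames: 160 samples x 1 byte
console.log(frameSizeBytes({ encoding: 'pcmu', sampleRate: 8000, frameDurationMs: 20 })); // 160
```

The same formula gives 3200 bytes for 16-bit mono PCM at 16 kHz with 100 ms frames.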
Twilio Integration
TTSRequest request = TTSRequest.builder()
.text("Your appointment is confirmed for tomorrow at 3 PM")
.voiceId("en-US-male")
.outputContainer("raw")
.outputEncoding("pcmu") // μ-law for Twilio
.outputSampleRate(8000) // 8kHz
.outputBitDepth(16)
.outputChannels(1) // Mono
.outputFrameDurationMs(20) // 20ms frames
.build();
sdk.textToSpeechStream(
request,
audioChunk -> {
// Send to Twilio Media Stream
twilioStream.sendMedia(audioChunk);
}
);
Custom Audio Format
TTSRequest request = TTSRequest.builder()
.text("Custom format example")
.voiceId("mamre")
.outputContainer("raw")
.outputEncoding("pcm")
.outputSampleRate(16000) // 16kHz
.outputBitDepth(16) // 16-bit
.outputChannels(1) // Mono
.outputFrameDurationMs(100) // 100ms frames
.build();
byte[] audio = sdk.textToSpeech(request);
// Expected: 16kHz PCM, 16-bit, mono
High Quality Audio
TTSRequest request = TTSRequest.builder()
.text("This is a high quality recording")
.voiceId("mamre")
.highQuality() // WAV @ 44.1kHz
.build();
byte[] audio = sdk.textToSpeech(request);
Files.write(Paths.get("high_quality.wav"), audio);
Error Handling
import com.talktopc.sdk.exception.TtsException;
try {
byte[] audio = sdk.textToSpeech("Test", "mamre");
} catch (TtsException e) {
System.err.println("TTS Error [" + e.getStatusCode() + "]: " + e.getErrorMessage());
switch (e.getStatusCode()) {
case 401:
System.err.println("→ Invalid API key");
break;
case 402:
System.err.println("→ Insufficient credits");
break;
case 400:
System.err.println("→ Invalid parameters");
break;
default:
System.err.println("→ Other error");
}
}
Full Configuration Example
import com.talktopc.sdk.models.TTSRequest;
import com.talktopc.sdk.models.TTSResponse;
// Build request with all options
TTSRequest request = TTSRequest.builder()
.text("Full configuration example")
.voiceId("mamre")
.speed(1.2)
.outputContainer("wav")
.outputEncoding("pcm")
.outputSampleRate(24000)
.outputBitDepth(16)
.outputChannels(1)
.build();
// Get response with metadata
TTSResponse response = sdk.synthesize(request);
System.out.println("Audio: " + response.getAudioSizeBytes() + " bytes");
System.out.println("Sample rate: " + response.getSampleRate() + " Hz");
System.out.println("Duration: " + response.getDurationMs() + " ms");
System.out.println("Credits: " + response.getCreditsUsed());
// Save audio
Files.write(Paths.get("output.wav"), response.getAudio());
Supported Audio Formats
| Format | Encoding | Sample Rates | Use Case |
|---|---|---|---|
| PCM | pcm | 8000, 16000, 22050, 24000, 44100 Hz | General purpose, high quality |
| PCMU (μ-law) | pcmu | 8000 Hz | Phone systems (Twilio, Telnyx, VoIP) |
| PCMA (A-law) | pcma | 8000 Hz | Phone systems (European standards) |
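The table above can be encoded as a lookup to reject unsupported encoding and sample rate combinations before a request is sent. This is a minimal sketch with the rates copied from the table; `isSupportedFormat` is a hypothetical helper.

```javascript
// Sample rates per encoding, copied from the Supported Audio Formats table.
const SUPPORTED_SAMPLE_RATES = {
  pcm: [8000, 16000, 22050, 24000, 44100],
  pcmu: [8000],
  pcma: [8000],
};

// Hypothetical helper: true if the encoding/sample-rate pair appears in the table.
function isSupportedFormat(encoding, sampleRate) {
  const rates = SUPPORTED_SAMPLE_RATES[encoding];
  return Array.isArray(rates) && rates.includes(sampleRate);
}
```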
Requirements
- Java 11 or higher
- Valid TalkToPC API key
- No external dependencies - Uses Java 11+ HttpClient
- Backend-only: Designed for server-side use, not browser
- Format Pass-through: Can request PCMU/PCMA and forward directly to phone systems
- No Audio Playback: Returns raw audio bytes - you handle playback/forwarding
- REST API: Uses REST endpoints instead of WebSocket
Resources
- GitHub Repository: TTP-GO/java-sdk
- Maven Package: com.talktopc:ttp-agent-sdk-java:1.0.5
- Documentation: See README.md in the repository
- Examples: Check src/main/java/com/talktopc/sdk/examples/