Introduction

TTP Agent SDK is a powerful JavaScript library for building AI-powered voice and chat interactions in your web applications.

Key Features

  • 🔒 Secure Authentication: Backend-to-backend signed link authentication with JWT and TTL-based expiration
  • 🎨 Fully Customizable: Colors, branding, languages, RTL support, and custom agent settings
  • 📱 Mobile Optimized: Works seamlessly on desktop, tablet, and mobile devices
  • 🌐 Multi-language: Built-in support for multiple languages and custom translations
  • ⚡ Real-time Streaming: WebSocket-based audio streaming with low latency
  • 🔧 Easy Integration: Simple CDN setup or NPM package with comprehensive API

Installation

NPM

npm install ttp-agent-sdk

CDN

<script src="https://unpkg.com/ttp-agent-sdk/dist/agent-widget.js"></script>

Import

// ES6 Import
import { VoiceSDK } from 'ttp-agent-sdk';

// CommonJS
const { VoiceSDK } = require('ttp-agent-sdk');

// Browser Global
const sdk = new window.TTPAgentSDK.VoiceSDK(config);

Quick Start

Get up and running in 5 minutes with this simple example.

Step 1: Get a Signed URL

🔒 Backend-to-Backend: Your frontend requests a signed URL from YOUR backend, which then communicates with the TTP backend. Never expose your API key to the frontend!

Frontend → Your Backend:

const response = await fetch('/api/get-voice-session', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    agentId: 'agent_123'
  })
});

const { signedUrl } = await response.json();

Your Backend → TTP Backend: (See Authentication section for details)

Step 2: Initialize the SDK

Create a VoiceSDK instance with the signed URL:

import { VoiceSDK } from 'ttp-agent-sdk';

const voiceSDK = new VoiceSDK({
  signedUrl: signedUrl,  // signedUrl from step 1
  appId: 'your_app_id',     // Your application ID
  agentId: 'agent_123',     // The AI agent to connect to
  
  // Optional: Configure audio formats (v2 protocol)
  outputContainer: 'raw',      // 'raw' or 'wav'
  outputEncoding: 'pcm',       // 'pcm', 'pcmu', 'pcma'
  outputSampleRate: 44100,     // Higher quality audio
  protocolVersion: 2           // Use v2 protocol for format negotiation
});

// Listen to events
voiceSDK.on('connected', () => {
  console.log('✅ Connected to agent');
});

voiceSDK.on('formatNegotiated', (format) => {
  console.log('✅ Format negotiated:', format);
  // Format contains: container, encoding, sampleRate, channels, bitDepth
});

voiceSDK.on('message', (msg) => {
  if (msg.t === 'agent_response') {
    console.log('Agent:', msg.agent_response);
  }
});

Step 3: Connect & Start Recording

Connect to the agent and start capturing audio:

// Connect
await voiceSDK.connect();

// Start recording
await voiceSDK.startRecording();

// Stop recording
await voiceSDK.stopRecording();
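
When the conversation is over, close the connection and release audio resources (both methods are documented in the VoiceSDK Class reference below):

// Clean up
voiceSDK.disconnect();   // close the WebSocket connection
voiceSDK.destroy();      // release audio resources and listeners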

Authentication

TTP Agent SDK uses signed WebSocket URLs for secure authentication.

🔒 Security First: All connections require signed URLs obtained from your backend, which authenticates with the TTP backend using an API key.

Authentication Flow

🔐 Critical Security Note: The signed URL generation happens Backend-to-Backend. Your frontend NEVER directly contacts the TTP backend. This protects your API key!

  1. 🖥️ Your Frontend: User clicks "Start Voice Chat"
     ↓
  2. 🔧 Your Backend: Receives POST /api/get-voice-session
     ↓ 🔒 Backend-to-Backend (API Key)
  3. 🏢 TTP Backend: Validates API key & generates signed JWT
     ↑ Returns signed URL
  4. 🔧 Your Backend: Forwards signed URL to frontend
     ↓
  5. 🖥️ Your Frontend: Uses SDK with signed URL (WebSocket)

Backend Implementation (Backend-to-Backend)

Your backend acts as a secure proxy between your frontend and TTP backend:

💡 Architecture: Frontend → Your Backend → TTP Backend → Your Backend → Frontend
⏱️ Configurable TTL: You can specify expirationMs (in milliseconds) to control how long the signed URL is valid. Default is 1 hour (3600000ms) if not specified.

Examples:
  • 5 minutes: 300000
  • 1 hour (default): 3600000
  • 24 hours: 86400000

// Example: Node.js/Express
// This endpoint is called by YOUR FRONTEND
app.post('/api/get-voice-session', async (req, res) => {
  const { agentId } = req.body;
  
  // 1️⃣ Authenticate YOUR user (your auth logic)
  if (!req.user) {
    return res.status(401).json({ error: 'Unauthorized' });
  }
  
  // 2️⃣ Backend-to-Backend: Call TTP Backend with YOUR API Key
  // ⚠️ This happens SERVER-SIDE only - API key never exposed to frontend
  const response = await fetch('https://backend.talktopc.com/api/public/agents/signed-url', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.TTP_API_KEY}`  // 🔒 Secret - never in frontend!
    },
    body: JSON.stringify({
      agentId: agentId,
      appId: process.env.TTP_APP_ID,
      expirationMs: 3600000,  // Optional: Token TTL in milliseconds (default: 1 hour)
      allowOverride: true     // Optional: Enable agent settings override (default: false)
    })
  });
  
  const { signedLink } = await response.json();
  
  // 3️⃣ Return signed URL to YOUR frontend
  res.json({ signedUrl: signedLink });
});

⚠️ Never expose your API key: The API key should only be used on your backend server, never in frontend code.

Response Format

The signed URL endpoint returns the following response:

{
  "signedLink": "wss://speech.talktopc.com/ws/conv?signed_token=eyJ...",
  "agentId": "agent_123",
  "userId": "user_789",
  "appId": "your_app_id",
  "expiresAt": "2025-11-11T21:00:00.000+00:00",
  "expiresIn": 3600000,
  "generatedAt": "2025-11-11T20:00:00.000+00:00",
  "availableCredits": 150.5,
  "authenticationStatus": "SUCCESS"
}
Property Type Description
signedLink string The WebSocket URL with signed JWT token
agentId string The AI agent identifier
userId string Your user's identifier (ttpId)
appId string Your application identifier
expiresAt Date When the signed URL expires (ISO 8601 format)
expiresIn number Token validity duration in milliseconds
generatedAt Date When the signed URL was generated (ISO 8601 format)
availableCredits number User's remaining credit balance
authenticationStatus string Always "SUCCESS" for successful requests
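
For reference, the response shape can be written with the same interface notation used in the API Reference below (the interface name here is illustrative, not part of the SDK; the timestamp fields are ISO 8601 strings on the wire):

interface SignedUrlResponse {
  signedLink: string;            // WebSocket URL containing the signed JWT token
  agentId: string;
  userId: string;
  appId: string;
  expiresAt: string;             // ISO 8601 timestamp
  expiresIn: number;             // milliseconds
  generatedAt: string;           // ISO 8601 timestamp
  availableCredits: number;
  authenticationStatus: 'SUCCESS';
}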

JWT Token Properties

Property Description
agentId The AI agent identifier
userId Your user's identifier
appId Your application identifier
allowOverride Permission flag for agent settings override (optional)
exp Token expiration time (TTL - configurable via expirationMs, defaults to 1 hour)
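
For debugging, you can inspect these claims by decoding the payload segment of the signedLink's signed_token parameter. This sketch assumes a standard base64url-encoded JWT with a numeric exp claim; never rely on client-side decoding for authorization decisions:

// Debugging only - the signature is not verified here
const token = new URL(signedLink).searchParams.get('signed_token');
const payload = JSON.parse(
  atob(token.split('.')[1].replace(/-/g, '+').replace(/_/g, '/'))
);
console.log(payload.agentId, payload.appId, new Date(payload.exp * 1000));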

Request Parameters

Parameter Type Required Description
agentId string Yes The AI agent identifier
appId string Yes Your application identifier
expirationMs number No Token TTL in milliseconds (default: 3600000 = 1 hour)
allowOverride boolean No Enable agent settings override permission (default: false)

Agent Settings Override

NEW FEATURE

Dynamically customize agent behavior, voice, and personality on a per-session basis.

🔒 Security: Agent overrides are only available with signed link authentication where allowOverride: true is granted by your backend.

How It Works

🔐 Backend-to-Backend First: The signed URL with override permission is obtained via backend-to-backend communication before your frontend can use it.
  1. Backend-to-Backend: Your backend requests a signed URL from TTP backend with allowOverride: true
  2. Backend → Frontend: Your backend returns the signed URL to your frontend
  3. Frontend: Your frontend uses the SDK with the signed URL + custom agent settings
  4. TTP Backend: Validates the JWT signature and applies your overrides if permission granted

Example

const voiceSDK = new VoiceSDK({
  signedUrl: signedUrl,  // signedUrl from your backend
  appId: 'your_app_id',     // Your application ID
  agentId: 'agent_123',     // The AI agent to connect to
  
  // Override agent settings
  agentSettingsOverride: {
    // Core settings
    prompt: "You are a friendly Spanish-speaking travel assistant",
    language: "es",
    temperature: 0.9,
    maxTokens: 200,
    
    // Voice settings
    voiceSpeed: 1.2,
    voiceId: "nova",  // Use voiceId (not selectedVoice)
    
    // Behavior
    firstMessage: "¡Hola! ¿Cómo puedo ayudarte hoy?",
    disableInterruptions: false,
    autoDetectLanguage: true,
    
    // Tools (optional)
    toolIds: [123, 456, 789],              // Custom tool IDs
    internalToolIds: ['calendar', 'email'] // Internal tool IDs
  }
});

Available Override Settings

15 out of 16 settings can be overridden. Only model selection is not supported (requires infrastructure changes).

📝 Core Settings

  • prompt - System prompt/instructions
  • temperature - LLM temperature (0-2)
  • maxTokens - Maximum response tokens
  • model - ⚠️ NOT SUPPORTED
  • language - Response language code

🔊 Voice Settings

  • voiceId - Specific voice ID
  • voiceSpeed - Speed multiplier (0.5-2)

⚙️ Behavior

  • firstMessage - Initial greeting
  • disableInterruptions - Allow/prevent barge-in
  • autoDetectLanguage - Auto language detection
  • candidateLanguages - List of languages for auto-detection
  • maxCallDuration - Max session duration (seconds)

🛠️ Advanced

  • toolIds - Array of custom tool IDs
  • internalToolIds - Array of internal tool IDs
  • timezone - User timezone

⚠️ Validation: All overrides are validated and sanitized on the server. Invalid values will be rejected or clamped to safe ranges.

Variables in Hello Request

Overview

Variables allow you to pass dynamic values to your agent that will be used to replace placeholders in the system prompt and first message. Variables sent in the hello request take precedence over default variables stored in the agent configuration.

Hello Message Format

SDK v2 Format (Recommended)

When using VoiceSDK v2, variables are passed in the SDK constructor:

import { VoiceSDK_v2 } from 'ttp-agent-sdk';

// Get signed URL from your backend first
const response = await fetch('/api/get-voice-session', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ agentId: 'agent_5a2b984c1', appId: 'app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC' })
});
const { signedUrl } = await response.json();

const voiceSDK = new VoiceSDK_v2({
  signedUrl: signedUrl,  // Use signedUrl from backend
  agentId: 'agent_5a2b984c1',
  appId: 'app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC',
  
  // Variables (optional)
  variables: {
    USER_NAME: 'John',
    ACCOUNT_TYPE: 'premium',
    LANGUAGE: 'en-US'
  },
  
  // Audio format configuration
  sampleRate: 44100,
  channels: 1,
  bitDepth: 16,
  outputContainer: 'raw',
  outputEncoding: 'pcm',
  outputSampleRate: 44100,
  outputChannels: 1,
  outputBitDepth: 16,
  outputFrameDurationMs: 600
});

await voiceSDK.connect();

Raw WebSocket Format

If connecting via raw WebSocket (without SDK), send variables in the hello message:

{
  "t": "hello",
  "v": 2,
  "variables": {
    "USER_NAME": "John",
    "ACCOUNT_TYPE": "premium",
    "LANGUAGE": "en-US"
  },
  "inputFormat": {
    "encoding": "pcm",
    "sampleRate": 44100,
    "channels": 1,
    "bitDepth": 16
  },
  "requestedOutputFormat": {
    "encoding": "pcm",
    "sampleRate": 44100,
    "channels": 1,
    "bitDepth": 16,
    "container": "raw"
  },
  "outputFrameDurationMs": 600
}

Variable Format

Variables are sent as a JSON object where:

  • Keys: Variable names (case-sensitive, e.g., USER_NAME)
  • Values: String values that will replace {{VARIABLE_NAME}} in the prompt

Example

{
  "variables": {
    "USER_NAME": "John",
    "ACCOUNT_TYPE": "premium",
    "LANGUAGE": "en-US",
    "COMPANY": "Acme Corp"
  }
}

Variable Replacement Priority

Variables are replaced in the following priority order:

  1. Hello Variables (highest priority) - Variables sent in the hello request
  2. Default Variables - Variables stored in agent configuration (Redis)
  3. Leave as-is - If no value found, {{VARIABLE_NAME}} remains unchanged

Example Priority

Agent Configuration (Redis):

{
  "USER_NAME": "David",
  "ACCOUNT_TYPE": "premium"
}

Hello Request:

{
  "variables": {
    "USER_NAME": "John"
  }
}

Result:

  • {{USER_NAME}} → "John" (from hello - takes precedence)
  • {{ACCOUNT_TYPE}} → "premium" (from defaults - hello doesn't override)

Usage Examples

Example 1: JavaScript/TypeScript with SDK v2

import { VoiceSDK_v2 } from 'ttp-agent-sdk';

// Get signed URL from your backend first
const response = await fetch('/api/get-voice-session', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ agentId: 'agent_5a2b984c1', appId: 'app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC' })
});
const { signedUrl } = await response.json();

const voiceSDK = new VoiceSDK_v2({
  signedUrl: signedUrl,  // Use signedUrl from backend
  agentId: 'agent_5a2b984c1',
  appId: 'app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC',
  
  variables: {
    USER_NAME: 'John Doe',
    ACCOUNT_TYPE: 'premium',
    LANGUAGE: 'en-US'
  },
  
  // ... audio format config
});

await voiceSDK.connect();

Example 2: Raw WebSocket (JavaScript)

const ws = new WebSocket('wss://speech.talktopc.com/ws/conv?agentId=agent_5a2b984c1&appId=app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC');

ws.onopen = () => {
  const helloMessage = {
    t: 'hello',
    v: 2,
    variables: {
      USER_NAME: 'John',
      ACCOUNT_TYPE: 'premium',
      LANGUAGE: 'en-US'
    },
    inputFormat: {
      encoding: 'pcm',
      sampleRate: 44100,
      channels: 1,
      bitDepth: 16
    },
    requestedOutputFormat: {
      encoding: 'pcm',
      sampleRate: 44100,
      channels: 1,
      bitDepth: 16,
      container: 'raw'
    },
    outputFrameDurationMs: 600
  };
  
  ws.send(JSON.stringify(helloMessage));
};

Example 3: Python WebSocket

import websocket
import json

def on_open(ws):
    hello_message = {
        "t": "hello",
        "v": 2,
        "variables": {
            "USER_NAME": "John",
            "ACCOUNT_TYPE": "premium",
            "LANGUAGE": "en-US"
        },
        "inputFormat": {
            "encoding": "pcm",
            "sampleRate": 44100,
            "channels": 1,
            "bitDepth": 16
        },
        "requestedOutputFormat": {
            "encoding": "pcm",
            "sampleRate": 44100,
            "channels": 1,
            "bitDepth": 16,
            "container": "raw"
        },
        "outputFrameDurationMs": 600
    }
    
    ws.send(json.dumps(hello_message))

ws = websocket.WebSocketApp(
    "wss://speech.talktopc.com/ws/conv?agentId=agent_5a2b984c1&appId=app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC",
    on_open=on_open
)
ws.run_forever()

Example 4: cURL / wscat

# Using wscat
wscat -c "wss://speech.talktopc.com/ws/conv?agentId=agent_5a2b984c1&appId=app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC"

# Then send:
{"t":"hello","v":2,"variables":{"USER_NAME":"John","ACCOUNT_TYPE":"premium"},"inputFormat":{"encoding":"pcm","sampleRate":44100,"channels":1,"bitDepth":16},"requestedOutputFormat":{"encoding":"pcm","sampleRate":44100,"channels":1,"bitDepth":16,"container":"raw"},"outputFrameDurationMs":600}

Agent Prompt Setup

To use variables in your agent, include placeholders in the system prompt or first message:

System Prompt Example

Your name is {{AGENT_NAME}}.
You are helping {{USER_NAME}} who has a {{ACCOUNT_TYPE}} account.
Speak in {{LANGUAGE}}.

First Message Example

Hello {{USER_NAME}}! Welcome to our {{ACCOUNT_TYPE}} service.

Backend Processing

When the hello request is received with variables, the backend processes them as follows (sketched in code after the list):

  1. Variables are extracted from the hello message
  2. Default variables are loaded from agent configuration (Redis)
  3. Variables are merged (hello variables take precedence)
  4. Prompt is processed - {{VARIABLE_NAME}} placeholders are replaced
  5. First message is processed - Variables are replaced here too
  6. Metadata is added - Variables metadata section is appended to prompt
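
A minimal sketch of the merge-and-replace behavior described above (illustrative only, not the TTP server implementation):

// Illustrative sketch: hello variables win over defaults, unknown placeholders stay as-is
function applyVariables(template, helloVariables = {}, defaultVariables = {}) {
  const merged = { ...defaultVariables, ...helloVariables };
  return template.replace(/\{\{(\w+)\}\}/g, (placeholder, name) =>
    name in merged ? String(merged[name]) : placeholder
  );
}

applyVariables(
  'Hello {{USER_NAME}}, you have a {{ACCOUNT_TYPE}} account',
  { USER_NAME: 'John' },                               // from the hello message
  { USER_NAME: 'David', ACCOUNT_TYPE: 'premium' }      // agent defaults
);
// => "Hello John, you have a premium account"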

Server Logs

After sending hello with variables, check server logs for:

📝 Processing variables from hello message: [USER_NAME, ACCOUNT_TYPE, LANGUAGE]
✅ Variables processed and prompt updated
📝 FINAL PROCESSED PROMPT (agentId: ...):
[Prompt with variables replaced]
✅ Processed variables: 3 hello variables, 2 default variables, metadata: added

Variable Naming Conventions

  • Use UPPERCASE with underscores: USER_NAME, ACCOUNT_TYPE
  • Variable names are case-sensitive
  • Avoid special characters except underscores
  • Recommended format: {{VARIABLE_NAME}} in prompts

Common Use Cases

1. User Personalization

{
  "variables": {
    "USER_NAME": "John Doe",
    "USER_EMAIL": "john@example.com"
  }
}

2. Account Context

{
  "variables": {
    "ACCOUNT_TYPE": "premium",
    "SUBSCRIPTION_STATUS": "active"
  }
}

3. Language/Localization

{
  "variables": {
    "LANGUAGE": "en-US",
    "CURRENCY": "USD"
  }
}

4. Session Context

{
  "variables": {
    "SESSION_ID": "abc123",
    "PAGE_URL": "https://example.com/products"
  }
}

Error Handling

Missing Variables

If a variable is referenced in the prompt but not provided (example below):

  • Has default value: Uses default from agent configuration
  • No default value: Placeholder remains unchanged ({{VARIABLE_NAME}})
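
For example (PLAN_NAME is just an illustrative variable name): given the prompt
"Hello {{USER_NAME}}, you are on the {{PLAN_NAME}} plan", a hello request providing only USER_NAME,
and no default for PLAN_NAME, the processed prompt is:

Hello John, you are on the {{PLAN_NAME}} plan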

Invalid Variable Format

  • Variables must be a JSON object
  • Values should be strings (will be converted to string if needed)
  • Empty object {} or null is valid (will use defaults only)

Best Practices

  1. Set defaults in agent configuration for all variables
  2. Override with hello variables only when you have dynamic values
  3. Use descriptive names that clearly indicate the variable's purpose
  4. Document variables in your agent's description or notes
  5. Test variables by checking server logs for "FINAL PROCESSED PROMPT"

API Reference

Hello Message Structure

interface HelloMessage {
  t: "hello";                    // Message type
  v?: number;                    // SDK version (2 for v2)
  variables?: {                   // Optional variables object
    [key: string]: string;        // Variable name -> value mapping
  };
  inputFormat?: AudioFormat;     // Input audio format
  requestedOutputFormat?: AudioFormat; // Output audio format
  outputFrameDurationMs?: number; // Frame duration for streaming
}

interface AudioFormat {
  encoding: "pcm" | "pcmu" | "pcma";
  sampleRate: number;
  channels: number;
  bitDepth: number;
  container?: "raw" | "wav";      // For output format only
}

Troubleshooting

Variables Not Being Replaced

  1. Check variable names match exactly (case-sensitive)
  2. Verify variables are sent in hello message (check logs)
  3. Check server logs for "FINAL PROCESSED PROMPT" to see actual replacement

Variables Not in Hello Message

  • SDK v2: Check if SDK supports variables in constructor
  • Raw WebSocket: Ensure variables field is included in JSON
  • Check WebSocket message is sent after connection opens

Default Variables Not Used

  • Verify variables are stored in Redis (check agent configuration)
  • Check extractVariablesFromAgentConfig is working
  • Look for "Default variables not found in state" warnings in logs

Events & Callbacks

The SDK emits events for all important state changes and interactions.

Event Categories

Connection Events

voiceSDK.on('connected', () => {
  console.log('✅ Connected to agent');
});

voiceSDK.on('disconnected', (event) => {
  console.log('❌ Disconnected:', event.reason);
  console.log('Close code:', event.code);
});

voiceSDK.on('error', (error) => {
  console.error('Error:', error);
});

Recording Events

voiceSDK.on('recordingStarted', () => {
  console.log('🎤 Recording started');
});

voiceSDK.on('recordingStopped', () => {
  console.log('⏹️ Recording stopped');
});

Message Events

voiceSDK.on('message', (msg) => {
  switch(msg.type) {
    case 'agent_response':
      console.log('Agent:', msg.agent_response);
      break;
    case 'transcription':
      console.log('You said:', msg.text);
      break;
    // ... other message types
  }
});

Audio Events

voiceSDK.on('playbackStarted', () => {
  console.log('🔊 Audio playback started');
});

voiceSDK.on('playbackStopped', () => {
  console.log('🔇 Audio playback stopped');
});

voiceSDK.on('audioData', (audioData) => {
  // Raw audio data (Uint8Array)
});

Special Events

// Barge-in (user interrupts agent)
voiceSDK.on('bargeIn', (message) => {
  console.log('User interrupted the agent');
});

// Format negotiation (v2 protocol only)
voiceSDK.on('formatNegotiated', (format) => {
  console.log('Format negotiated:', format);
  // format: { container, encoding, sampleRate, channels, bitDepth }
});

// Greeting audio
voiceSDK.on('greetingStarted', () => {
  console.log('Playing greeting message');
});

// Domain whitelist error
voiceSDK.on('domainError', (error) => {
  console.error('Domain not whitelisted:', error.reason);
});

Protocol v2 - Format Negotiation

The SDK v2 introduces format negotiation, allowing you to specify exactly what audio format you want to receive from the server.

🎯 Key Benefits:
  • Format Control: Request specific audio formats (container, encoding, sample rate, bit depth)
  • Automatic Conversion: SDK automatically converts audio if backend sends different format
  • Quality Optimization: Choose optimal formats for your use case (e.g., 48kHz for high quality, 8kHz for bandwidth savings)
  • Backward Compatible: Works with v1 protocol (set protocolVersion: 1)

Supported Formats

Input Formats (What SDK Sends)

Property Supported Values
encoding 'pcm', 'pcmu' (μ-law), 'pcma' (A-law)
sampleRate 8000, 16000, 22050, 24000, 44100, 48000 Hz
bitDepth 8, 16, 24 bits
channels 1 (mono only)

Output Formats (What SDK Receives)

Property Supported Values
container 'raw' (no header), 'wav' (with WAV header)
encoding 'pcm', 'pcmu' (μ-law), 'pcma' (A-law)
sampleRate 8000, 16000, 22050, 24000, 44100, 48000 Hz
bitDepth 8, 16, 24 bits
channels 1 (mono only)

Format Negotiation Flow

  1. SDK Initialization: Configure requested output format
     ↓
  2. Connect & Send Hello: SDK sends format request in hello message
     ↓
  3. Server Response: Server sends hello_ack with negotiated format
     ↓
  4. Format Negotiated Event: SDK emits 'formatNegotiated' event
     ↓
  5. Automatic Conversion: If formats differ, SDK converts automatically

Example: High-Quality Audio

const voiceSDK = new VoiceSDK({
  signedUrl: signedUrl,  // signedUrl from backend
  appId: 'your_app_id',
  agentId: 'agent_123',
  
  // Request high-quality audio
  outputContainer: 'raw',        // Raw PCM for lower latency
  outputEncoding: 'pcm',         // Uncompressed PCM
  outputSampleRate: 48000,       // 48kHz for high quality
  outputBitDepth: 16,            // 16-bit depth
  outputChannels: 1,             // Mono
  outputFrameDurationMs: 600,    // 600ms frames
  
  protocolVersion: 2             // Enable format negotiation
});

voiceSDK.on('formatNegotiated', (format) => {
  console.log('Negotiated format:', format);
  // If backend sends different format, SDK will convert automatically
});

Example: Bandwidth-Optimized Audio

const voiceSDK = new VoiceSDK({
  signedUrl: signedUrl,  // signedUrl from backend
  appId: 'your_app_id',
  agentId: 'agent_123',
  
  // Request compressed, low-bandwidth audio
  outputContainer: 'raw',
  outputEncoding: 'pcmu',        // μ-law compression (8kHz equivalent)
  outputSampleRate: 8000,        // 8kHz for bandwidth savings
  outputBitDepth: 16,
  outputChannels: 1,
  
  protocolVersion: 2
});

Format Conversion

If the backend sends audio in a different format than requested, the SDK automatically converts it (an illustrative sketch of one such conversion follows the list):

  • Container: WAV ↔ Raw PCM extraction/wrapping
  • Encoding: PCM ↔ PCMU/PCMA encoding/decoding
  • Sample Rate: Automatic resampling using Web Audio API
  • Bit Depth: 8-bit ↔ 16-bit ↔ 24-bit conversion
  • Channels: Mono/stereo conversion (if needed)
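
These conversions happen inside the SDK, so you normally do not implement them yourself. Purely as an illustration of the resampling step (not the SDK's internal code), mono PCM can be resampled in the browser with the standard Web Audio API:

// Illustration only: resample mono Float32 PCM samples from one rate to another
async function resamplePcm(samples, fromRate, toRate) {
  const outLength = Math.ceil(samples.length * toRate / fromRate);
  const offlineCtx = new OfflineAudioContext(1, outLength, toRate);

  // Wrap the input samples in an AudioBuffer at the source rate
  const inputBuffer = offlineCtx.createBuffer(1, samples.length, fromRate);
  inputBuffer.copyToChannel(samples, 0);

  const source = offlineCtx.createBufferSource();
  source.buffer = inputBuffer;
  source.connect(offlineCtx.destination);
  source.start();

  const rendered = await offlineCtx.startRendering();
  return rendered.getChannelData(0); // Float32Array at the target rate
}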

💡 Best Practices:
  • Use protocolVersion: 2 for new projects
  • Request formats that match your use case (quality vs. bandwidth)
  • 48kHz is recommended for best quality (matches most browser defaults)
  • Raw PCM is lower latency than WAV (no header overhead)
  • Listen to formatNegotiated event to verify format

Voice & Chat Widget

Pre-built, customizable widget with voice and text chat - perfect for adding AI conversation to any website.

  • 💬 Voice & Text Chat: Beautiful interface with voice recording, text chat, and message history
  • 🎨 Fully Customizable: Colors, position, size, RTL support, and custom branding
  • 📱 Mobile Optimized: Responsive design that works perfectly on all devices
  • 🌐 Multi-language: Built-in support for multiple languages with custom translations

Installation

<!-- Add the SDK script to your page -->
<script src="https://unpkg.com/ttp-agent-sdk/dist/agent-widget.js"></script>

<!-- Initialize the widget -->
<script>
  const widget = new TTPAgentSDK.TTPChatWidget({
    agentId: 'agent_123',
    appId: 'your_app_id'
  });
</script>

Basic Configuration

const widget = new TTPAgentSDK.TTPChatWidget({
  // Required
  agentId: 'agent_123',       // Your AI agent ID
  appId: 'your_app_id',       // Your application ID
  
  // Optional - Signed URL (recommended for production)
  signedUrl: 'wss://speech.talktopc.com/ws/conv?signed_token=...', // Signed URL from your backend
  
  // Optional - Agent Settings Override (requires signed URL with allowOverride=true)
  agentSettingsOverride: {
    prompt: "You are a helpful customer service assistant.",
    temperature: 0.8,
    voiceId: "F2",
    voiceSpeed: 1.2,
    firstMessage: "Hello! How can I help you today?",
    disableInterruptions: false,
    maxCallDuration: 600
  },
  
  // Optional - Appearance
  primaryColor: '#7C3AED',    // Widget theme color
  position: 'bottom-right',   // 'bottom-right', 'bottom-left'
  language: 'en',             // 'en', 'es', 'fr', 'de', 'he', etc.
  direction: 'ltr',           // 'ltr' or 'rtl' for right-to-left languages
  
  // Optional - Variables
  variables: {
    userName: 'John Doe',
    page: 'homepage',
    customData: 'value'
  }
});

Signed Link Authentication (Production)

🔒 Security Best Practice: For production applications, use signed links instead of exposing agent IDs directly. Signed links provide secure, time-limited authentication.

To use signed links with the widget, provide a signedUrl parameter. You should fetch this signed URL from your backend before initializing the widget:

// Step 1: Get signed URL from your backend
async function initializeWidget() {
  // Request signed URL from your backend
  const response = await fetch('/api/get-voice-session', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${getAuthToken()}` // Your auth token
    },
    body: JSON.stringify({
      agentId: 'agent_123',
      appId: 'your_app_id',
      variables: {
        userName: 'John Doe',
        page: 'homepage'
      }
    })
  });
  
  if (!response.ok) {
    throw new Error(`Backend API error: ${response.status}`);
  }
  
  const data = await response.json();
  
  // Backend must return signedUrl field
  if (!data.signedUrl) {
    throw new Error('Backend response must contain "signedUrl" field');
  }
  
  // Step 2: Initialize widget with signedUrl
  const widget = new TTPAgentSDK.TTPChatWidget({
    agentId: 'agent_123',
    appId: 'your_app_id',
    signedUrl: data.signedUrl,  // Use signed URL from backend
    variables: {
      userName: 'John Doe',
      page: 'homepage'
    }
  });
}

// Initialize when ready
initializeWidget();

📋 Important:
  • The widget accepts signedUrl as a direct parameter (not getSessionUrl)
  • You must fetch the signed URL from your backend before initializing the widget
  • Your backend must return a response with a signedUrl field (not websocketUrl, url, or any other field)
  • If signedUrl is not provided, the widget will construct a URL from agentId and appId (less secure)
  • Your backend should authenticate with TTP backend using your API key (see Authentication section)

Backend Implementation Example

Your backend endpoint should request signed URLs from TTP:

// Your Backend (Node.js/Express example)
app.post('/api/get-voice-session', async (req, res) => {
  const { agentId, appId, variables } = req.body;
  
  // Authenticate your user
  if (!req.user) {
    return res.status(401).json({ error: 'Unauthorized' });
  }
  
  // Backend-to-Backend: Request signed URL from TTP
  const response = await fetch('https://backend.talktopc.com/api/public/agents/signed-url', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.TTP_API_KEY}` // Your TTP API key
    },
    body: JSON.stringify({
      agentId,
      appId,
      variables
    })
  });
  
  if (!response.ok) {
    return res.status(500).json({ error: 'Failed to get signed URL' });
  }
  
  const data = await response.json();
  
  // TTP returns the URL in the "signedLink" field (see Response Format above);
  // return it to the frontend as "signedUrl"
  res.json({
    signedUrl: data.signedLink
  });
});

Advanced Customization

Icon Customization

const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'agent_123',
  appId: 'your_app_id',
  
  icon: {
    type: 'custom',                              // 'default', 'emoji', or 'custom'
    customImage: 'https://your-site.com/logo.png', // Custom image URL
    size: 60,                                    // Icon size in pixels
    backgroundColor: '#FFFFFF',                  // Background color
    borderRadius: '50%'                         // Border radius (50% for circle)
  }
});

Chat Window Customization

const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'agent_123',
  appId: 'your_app_id',
  
  chatWindow: {
    width: 400,                    // Width in pixels
    height: 600,                   // Height in pixels
    title: 'Chat with us!',        // Custom title
    subtitle: 'We reply instantly', // Custom subtitle
    placeholder: 'Type here...',   // Input placeholder
    borderRadius: 12               // Window border radius
  }
});

Branding

const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'agent_123',
  appId: 'your_app_id',
  
  branding: {
    companyName: 'Your Company',
    logo: 'https://your-site.com/logo.png',
    showPoweredBy: false           // Hide "Powered by TTP" footer
  }
});

Agent Settings Override

🔒 Security: Agent settings override requires a signed URL with allowOverride: true granted by your backend.

Dynamically customize agent behavior, voice, and personality on a per-session basis:

const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'agent_123',
  appId: 'your_app_id',
  signedUrl: 'wss://speech.talktopc.com/ws/conv?signed_token=...', // Required: signed URL with allowOverride=true
  
  // Override agent settings dynamically
  agentSettingsOverride: {
    // Core settings
    prompt: "You are a friendly customer service assistant",
    temperature: 0.8,
    maxTokens: 200,
    
    // Voice settings
    voiceId: "F2",
    voiceSpeed: 1.2,
    
    // Behavior
    firstMessage: "Hello! How can I help you today?",
    disableInterruptions: false,
    maxCallDuration: 600,
    
    // Language
    language: "en",
    autoDetectLanguage: false
  }
});

See the Agent Settings Override section for complete documentation of all available override settings.

RTL (Right-to-Left) Support

// For Hebrew, Arabic, etc.
const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'agent_123',
  appId: 'your_app_id',
  direction: 'rtl',
  language: 'he',                  // Hebrew
  position: 'bottom-left'          // Better for RTL
});

Widget Methods

widget.open()

Programmatically open the chat window.

// Open chat from your own button
document.getElementById('myButton').onclick = () => {
  widget.open();
};

widget.close()

Close the chat window.

widget.close();

widget.toggle()

Toggle chat window open/closed.

widget.toggle();

widget.destroy()

Remove the widget from the page.

widget.destroy();

widget.updateConfig(config)

Update widget configuration dynamically.

widget.updateConfig({
  primaryColor: '#FF5733',
  language: 'es'
});

Widget Events

const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'agent_123',
  appId: 'your_app_id',
  
  // Event callbacks
  onOpen: () => {
    console.log('Chat opened');
  },
  
  onClose: () => {
    console.log('Chat closed');
  },
  
  onMessage: (message) => {
    console.log('New message:', message);
  },
  
  onError: (error) => {
    console.error('Widget error:', error);
  },
  
  onReady: () => {
    console.log('Widget initialized');
  }
});

Complete Example

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>My Website with AI Chat</title>
</head>
<body>
  <h1>Welcome to my website!</h1>
  
  <!-- Load the SDK -->
  <script src="https://unpkg.com/ttp-agent-sdk/dist/agent-widget.js"></script>
  
  <!-- Initialize widget -->
  <script>
    const widget = new TTPAgentSDK.TTPChatWidget({
      agentId: 'agent_123',
      appId: 'your_app_id',
      
      // Customize appearance
      primaryColor: '#7C3AED',
      position: 'bottom-right',
      language: 'en',
      
      // Custom branding
      chatWindow: {
        title: 'Chat with us!',
        subtitle: 'We typically reply instantly'
      },
      
      // Pass context variables
      variables: {
        userName: 'Visitor',
        page: window.location.pathname,
        referrer: document.referrer
      },
      
      // Event handlers
      onReady: () => {
        console.log('Chat widget ready!');
      },
      
      onMessage: (message) => {
        // Track messages in analytics
        console.log('Message:', message);
      }
    });
    
    // Optional: Open chat programmatically
    // widget.open();
  </script>
</body>
</html>

Configuration Reference

🎨 Extensive Customization

The Voice & Chat Widget can be customized in almost every aspect - colors, text, icons, sizes, behaviors, and more!

See the live demo to experiment with all customization options interactively.

Required Configuration
Property Type Description
agentId string Your AI agent identifier
appId string Your application identifier
General Configuration
Property Type Default Description
primaryColor string '#7C3AED' Main theme color (hex)
direction string 'ltr' 'ltr' or 'rtl'
language string 'en' Language code (en, es, fr, de, he, ar, etc.)
variables object {} Custom variables to pass to agent
signedUrl string | function null Signed URL for secure authentication (string or async function that returns URL)
agentSettingsOverride object null Override agent settings dynamically (requires signed URL with allowOverride=true). See Agent Settings Override for details.
customStyles string '' Custom CSS to inject
Positioning
Property Type Default Description
position string | object 'bottom-right' String: 'bottom-right', 'bottom-left'. Object: { vertical, horizontal, offset }
position.vertical string 'bottom' 'top' or 'bottom'
position.horizontal string 'right' 'left' or 'right'
position.offset object { x: 20, y: 20 } Offset from edges (pixels)
Icon & Button
Property Type Default Description
icon.type string 'custom' 'microphone', 'custom', 'emoji', 'text'
icon.customImage string TTP logo URL to custom icon image
icon.size string 'medium' 'small', 'medium', 'large', 'xl'
icon.backgroundColor string '#FFFFFF' Icon background color
button.size string 'medium' 'small', 'medium', 'large'
button.shape string 'circle' 'circle', 'rounded', 'square'
button.backgroundColor string primaryColor Button background color
button.hoverColor string '#7C3AED' Button hover color
button.shadow boolean true Enable button shadow
Panel & Header
Property Type Default Description
panel.width number 350 Panel width (pixels)
panel.height number 500 Panel height (pixels)
panel.borderRadius number 12 Border radius (pixels)
panel.backgroundColor string '#FFFFFF' Panel background color
header.title string 'Chat Assistant' Header title text
header.showTitle boolean true Show/hide header title
header.backgroundColor string '#7C3AED' Header background color
header.textColor string '#FFFFFF' Header text color
Messages & Chat
Property Type Default Description
messages.userBackgroundColor string '#E5E7EB' User message background
messages.agentBackgroundColor string '#F3F4F6' Agent message background
messages.systemBackgroundColor string '#DCFCE7' System message background
messages.errorBackgroundColor string '#FEE2E2' Error message background
messages.textColor string '#1F2937' Message text color
messages.fontSize string '14px' Message font size
messages.borderRadius number 8 Message bubble radius
text.sendButtonColor string '#7C3AED' Send button color
text.sendButtonHoverColor string '#6D28D9' Send button hover color
text.sendButtonActiveColor string '#6D28D9' Send button active color
text.sendButtonText string '➤' Send button text/icon
text.sendButtonTextColor string '#FFFFFF' Send button text color
text.sendButtonFontSize string '18px' Send button font size
text.sendButtonFontWeight string '500' Send button font weight
text.inputPlaceholder string 'Type your message...' Input placeholder text
text.inputBorderColor string '#E5E7EB' Input border color
text.inputFocusColor string '#7C3AED' Input focus color
text.inputBackgroundColor string '#FFFFFF' Input background color
text.inputTextColor string '#1F2937' Input text color
text.inputFontSize string '14px' Input font size
text.inputBorderRadius number 20 Input border radius (pixels)
text.inputPadding string '6px 14px' Input padding
Voice Configuration
Property Type Default Description
voice.micButtonColor string primaryColor Microphone button color (inside panel)
voice.micButtonActiveColor string '#EF4444' Microphone button color when active
voice.micButtonHint.text string 'Click the button to start...' Hint text below mic button
voice.micButtonHint.color string '#6B7280' Hint text color
voice.avatarBackgroundColor string '#667eea' Voice avatar background
voice.avatarActiveBackgroundColor string '#667eea' Avatar background when active
voice.statusTitleColor string '#1e293b' Status title text color
voice.statusSubtitleColor string '#64748b' Status subtitle text color
voice.startCallButtonColor string '#667eea' Start call button color
voice.startCallButtonTextColor string '#FFFFFF' Start call button text color
voice.endCallButtonColor string '#ef4444' End call button color
voice.transcriptBackgroundColor string '#FFFFFF' Transcript background
voice.transcriptTextColor string '#1e293b' Transcript text color
voice.transcriptLabelColor string '#94a3b8' Transcript label color
voice.controlButtonColor string '#FFFFFF' Control button color
voice.controlButtonSecondaryColor string '#64748b' Secondary control button color
voice.language string 'en' Voice language (overrides global)
Behavior
Property Type Default Description
behavior.mode string 'unified' 'unified' (both), 'voice-only', 'text-only'
behavior.autoOpen boolean false Auto-open widget on page load
behavior.startOpen boolean false Start with widget open
behavior.hidden boolean false Hide the widget completely
behavior.autoConnect boolean false Auto-connect on widget open
behavior.showWelcomeMessage boolean true Show welcome message
behavior.welcomeMessage string 'Hello! How can I help...' Welcome message text
behavior.enableVoiceMode boolean true Enable voice mode option (in unified mode)
Animation & Tooltips
Property Type Default Description
animation.enableHover boolean true Enable hover animations
animation.enablePulse boolean true Enable pulse animations
animation.enableSlide boolean true Enable slide animations
animation.duration number 0.3 Animation duration (seconds)
tooltips.newChat string Auto New chat button tooltip
tooltips.back string Auto Back button tooltip
tooltips.close string Auto Close button tooltip
tooltips.mute string Auto Mute button tooltip
tooltips.speaker string Auto Speaker button tooltip
tooltips.endCall string Auto End call button tooltip
Landing Screen (Unified Mode)
Property Type Default Description
landing.backgroundColor string Gradient Landing screen background (color or gradient)
landing.logo string '🤖' Landing screen logo (emoji or text)
landing.title string null Landing screen title (null uses translation)
landing.titleColor string '#1e293b' Landing title text color
landing.modeCardBackgroundColor string '#FFFFFF' Mode selection card background
landing.modeCardBorderColor string '#E2E8F0' Mode card border color
landing.modeCardHoverBorderColor string header.backgroundColor Mode card border color on hover
landing.modeCardIconBackgroundColor string header.backgroundColor Mode card icon background
landing.modeCardTitleColor string '#111827' Mode card title text color
landing.voiceCardIcon string '🎤' Voice mode card icon
landing.textCardIcon string '💬' Text mode card icon
Advanced Configuration
Property Type Default Description
signedUrl string Optional Signed WebSocket URL for secure authentication (for voice). If not provided, URL is constructed from agentId/appId.
demo boolean true Enable demo mode
panel.backdropFilter string null CSS backdrop filter (e.g., 'blur(10px)')
panel.border string '1px solid rgba(0,0,0,0.1)' Panel border style
button.shadowColor string 'rgba(0,0,0,0.15)' Button shadow color
icon.emoji string '🎤' Emoji when icon.type = 'emoji'
icon.text string 'AI' Text when icon.type = 'text'
accessibility.ariaLabel string 'Chat Assistant' ARIA label for the widget
accessibility.ariaDescription string 'Click to open chat assistant' ARIA description
accessibility.keyboardNavigation boolean true Enable keyboard navigation
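
Dotted names in the tables above refer to nested objects in the configuration, as in the icon and branding examples earlier. For example (illustrative values, assuming that mapping):

const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'agent_123',
  appId: 'your_app_id',

  panel: { width: 380, height: 560, borderRadius: 16 },
  header: { title: 'Support', backgroundColor: '#0F766E', textColor: '#FFFFFF' },
  messages: { userBackgroundColor: '#D1FAE5', agentBackgroundColor: '#F3F4F6' },
  voice: { micButtonColor: '#0F766E', startCallButtonColor: '#0F766E' },
  behavior: { mode: 'unified', welcomeMessage: 'Hi! How can we help?' },
  landing: { logo: '💬', title: 'How would you like to reach us?' }
});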

✅ Configuration Verified:

All configuration options listed above have been verified against the source code and are fully supported. The widget has 80+ customization options across 13 categories:

  • General (5 options)
  • Positioning (4 options)
  • Icon & Button (9 options)
  • Panel & Header (8 options)
  • Messages (7 options)
  • Text Chat (13 options)
  • Voice Interface (17 options)
  • Landing Screen (11 options)
  • Tooltips (6 options)
  • Animations (4 options)
  • Behavior (8 options)
  • Accessibility (3 options)
  • Advanced (11 options)

Experiment with all options interactively in the live demo.

📖 Tip: All configuration options support the spread operator (...), so you can pass additional custom properties that will be merged with defaults.
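
For example (illustrative), a shared base configuration can be spread into each page's widget:

const baseConfig = {
  appId: 'your_app_id',
  primaryColor: '#7C3AED',
  language: 'en'
};

const widget = new TTPAgentSDK.TTPChatWidget({
  ...baseConfig,
  agentId: 'agent_123',
  position: 'bottom-left'   // page-specific override
});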

🎮 Live Demo: Try the fully functional widget with customization options at test-text-chat.html

Use Cases

  • 💼 Customer Support: Add 24/7 AI-powered support to your website
  • 🛒 E-commerce: Help customers find products and answer questions
  • 📚 Documentation: Provide interactive help for your docs
  • 🎓 Education: Create AI tutors and learning assistants

Vanilla JavaScript Guide

Use the SDK in any JavaScript application without frameworks.

Complete Example

import { VoiceSDK } from 'ttp-agent-sdk';

class VoiceAssistant {
  constructor() {
    this.sdk = null;
    this.isConnected = false;
    this.isRecording = false;
  }

  async initialize(agentId, overrides = {}) {
    // Step 1: Get signed URL
    const signedUrl = await this.getSignedUrl(agentId);

    // Step 2: Create SDK
    this.sdk = new VoiceSDK({
      signedUrl: signedUrl,  // signedUrl from backend
      appId: 'your_app_id',
      agentId: agentId,
      agentSettingsOverride: overrides
    });

    // Setup event listeners
    this.setupEventListeners();

    // Connect
    await this.sdk.connect();
  }

  setupEventListeners() {
    this.sdk.on('connected', () => {
      this.isConnected = true;
      this.updateUI('connected');
    });

    this.sdk.on('disconnected', () => {
      this.isConnected = false;
      this.isRecording = false;
      this.updateUI('disconnected');
    });

    this.sdk.on('recordingStarted', () => {
      this.isRecording = true;
      this.updateUI('recording');
    });

    this.sdk.on('recordingStopped', () => {
      this.isRecording = false;
      this.updateUI('connected');
    });

    this.sdk.on('message', (msg) => {
      if (msg.type === 'agent_response') {
        this.displayMessage('agent', msg.agent_response);
      }
    });

    this.sdk.on('error', (error) => {
      this.handleError(error);
    });
  }

  async getSignedUrl(agentId) {
    const response = await fetch('/api/get-voice-session', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        agentId: agentId
      })
    });
    
    const { signedUrl } = await response.json();
    return signedUrl;
  }

  async toggleRecording() {
    if (!this.isConnected) return;

    if (this.isRecording) {
      await this.sdk.stopRecording();
    } else {
      await this.sdk.startRecording();
    }
  }

  disconnect() {
    if (this.sdk) {
      this.sdk.disconnect();
      this.sdk = null;
    }
  }

  updateUI(state) {
    // Update your UI based on state
  }

  displayMessage(role, text) {
    // Display message in your UI
  }

  handleError(error) {
    // Handle errors
  }
}

// Usage
const assistant = new VoiceAssistant();

await assistant.initialize('agent_123', {
  language: 'es',
  temperature: 0.9
});

React Integration

Use the SDK in React applications with hooks and components.

Using Hooks

import React, { useState, useEffect, useRef } from 'react';
import { VoiceSDK } from 'ttp-agent-sdk';

function VoiceChat() {
  const [status, setStatus] = useState('disconnected');
  const [isRecording, setIsRecording] = useState(false);
  const [messages, setMessages] = useState([]);
  const sdkRef = useRef(null);

  // Initialize SDK
  useEffect(() => {
    async function initSDK() {
      // Get signed URL
      const response = await fetch('/api/get-voice-session', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({
          agentId: 'agent_123'
        })
      });
      
      const { signedUrl } = await response.json();

      // Create SDK
      const sdk = new VoiceSDK({
        signedUrl: signedUrl,  // signedUrl from backend
        appId: 'your_app_id',
        agentId: 'agent_123',
        agentSettingsOverride: {
          language: 'es',
          temperature: 0.9
        }
      });

      // Event listeners
      sdk.on('connected', () => setStatus('connected'));
      sdk.on('disconnected', () => {
        setStatus('disconnected');
        setIsRecording(false);
      });
      sdk.on('recordingStarted', () => setIsRecording(true));
      sdk.on('recordingStopped', () => setIsRecording(false));
      sdk.on('message', (msg) => {
        if (msg.type === 'agent_response') {
          setMessages(prev => [
            ...prev,
            { role: 'agent', text: msg.agent_response }
          ]);
        }
      });

      await sdk.connect();
      sdkRef.current = sdk;
    }

    initSDK();

    // Cleanup
    return () => {
      if (sdkRef.current) {
        sdkRef.current.disconnect();
      }
    };
  }, []);

  const toggleRecording = async () => {
    if (sdkRef.current) {
      await sdkRef.current.toggleRecording();
    }
  };

  return (
    <div>
      <div>Status: {status}</div>
      <button onClick={toggleRecording} disabled={status !== 'connected'}>
        {isRecording ? 'Stop' : 'Start'} Recording
      </button>
      <div>
        {messages.map((msg, i) => (
          <div key={i}>{msg.role}: {msg.text}</div>
        ))}
      </div>
    </div>
  );
}

export default VoiceChat;

VoiceButton Component

Pre-built React component for quick integration.

Installation

import { VoiceButton } from 'ttp-agent-sdk/react';

Basic Usage

import React, { useState, useEffect } from 'react';
import { VoiceButton } from 'ttp-agent-sdk/react';

function App() {
  const [signedUrl, setSignedUrl] = useState(null);

  useEffect(() => {
    async function fetchSignedUrl() {
      const response = await fetch('/api/get-voice-session', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({
          agentId: 'agent_123'
        })
      });
      
      const { signedUrl } = await response.json();
      setSignedUrl(signedUrl);
    }
    
    fetchSignedUrl();
  }, []);

  if (!signedUrl) return <div>Loading...</div>;

  return (
    <VoiceButton
      signedUrl={signedUrl}
      appId="your_app_id"
      agentId="agent_123"
      agentSettingsOverride={{
        language: 'es',
        temperature: 0.9
      }}
      onConnected={() => console.log('Connected')}
      onMessage={(msg) => console.log('Message:', msg)}
      onError={(error) => console.error('Error:', error)}
    />
  );
}

Props

Prop Type Required Description
signedUrl string Yes Signed WebSocket URL
appId string Yes Your application ID
agentId string Yes The AI agent identifier
agentSettingsOverride object No Custom agent settings
voice string No Voice preset (default: 'default')
language string No Language code (default: 'en')
autoReconnect boolean No Auto-reconnect on disconnect (default: true)
className string No Custom CSS class for the button
style object No Inline styles for the button
children React.Node No Custom button content (replaces default icon)
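
The className, style, and children props let you restyle the button or replace its default icon entirely, for example (illustrative class name and content):

<VoiceButton
  signedUrl={signedUrl}
  appId="your_app_id"
  agentId="agent_123"
  className="my-voice-button"
  style={{ borderRadius: 8, padding: '12px 20px' }}
>
  🎙️ Talk to us
</VoiceButton>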

Event Callbacks

Callback Parameters Description
onConnected - Called when successfully connected
onDisconnected - Called when disconnected
onRecordingStarted - Called when recording starts
onRecordingStopped - Called when recording stops
onPlaybackStarted - Called when audio playback starts
onPlaybackStopped - Called when audio playback stops
onMessage message Called for all WebSocket messages
onError error Called on errors
onBargeIn message Called when user interrupts agent
onStopPlaying message Called when server requests to stop audio

VoiceSDK Class

Core class for voice interaction functionality.

Constructor

new VoiceSDK(config)

Configuration Object

Property Type Required Description
signedUrl string Yes Signed WebSocket URL from your backend
appId string Yes Your application identifier
agentId string Yes The AI agent identifier to connect to
agentSettingsOverride object No Custom agent configuration
voice string No Voice preset name (default: 'default')
language string No Language code (default: 'en')
sampleRate number No Input audio sample rate: 8000, 16000, 22050, 24000, 44100, or 48000 Hz (default: 16000)
channels number No Input audio channels (default: 1, mono only)
bitDepth number No Input audio bit depth: 8, 16, or 24 bits (default: 16)
outputContainer string No Output container format: 'raw' or 'wav' (default: 'raw')
outputEncoding string No Output audio encoding: 'pcm', 'pcmu' (μ-law), or 'pcma' (A-law) (default: 'pcm')
outputSampleRate number No Output audio sample rate: 8000, 16000, 22050, 24000, 44100, or 48000 Hz (default: 16000)
outputChannels number No Output audio channels (default: 1, mono only)
outputBitDepth number No Output audio bit depth: 8, 16, or 24 bits (default: 16)
outputFrameDurationMs number No Frame duration for raw PCM streaming in milliseconds (default: 600)
protocolVersion number No Protocol version: 1 (legacy) or 2 (format negotiation) (default: 2)
autoReconnect boolean No Auto-reconnect on disconnect (default: true)

Audio Format Configuration (v2 Protocol)

The SDK v2 supports format negotiation with the backend. You can specify both input and output audio formats:

📋 Format Support:
  • Input Encodings: PCM, PCMU (μ-law), PCMA (A-law)
  • Input Sample Rates: 8000, 16000, 22050, 24000, 44100, 48000 Hz
  • Input Bit Depths: 8, 16, 24 bits
  • Output Containers: 'raw' (no header) or 'wav' (with header)
  • Output Encodings: PCM, PCMU (μ-law), PCMA (A-law)
  • Output Sample Rates: 8000, 16000, 22050, 24000, 44100, 48000 Hz
  • Output Bit Depths: 8, 16, 24 bits

Example: Custom Audio Format

const voiceSDK = new VoiceSDK({
  signedUrl: signedUrl,  // signedUrl from backend
  appId: 'your_app_id',
  agentId: 'agent_123',
  
  // Input format (what we send to server)
  sampleRate: 16000,
  channels: 1,
  bitDepth: 16,
  
  // Output format (what we want from server)
  outputContainer: 'raw',        // 'raw' or 'wav'
  outputEncoding: 'pcm',          // 'pcm', 'pcmu', 'pcma'
  outputSampleRate: 44100,       // Higher quality
  outputChannels: 1,
  outputBitDepth: 16,
  outputFrameDurationMs: 600,    // Frame duration for streaming
  
  // Protocol version
  protocolVersion: 2              // Use v2 protocol for format negotiation
});

// Listen for format negotiation
voiceSDK.on('formatNegotiated', (format) => {
  console.log('Format negotiated:', format);
  // format contains: { container, encoding, sampleRate, channels, bitDepth }
});

Methods

connect()

Connect to the voice agent.

Returns: Promise<boolean>

await voiceSDK.connect();

disconnect()

Disconnect from the voice agent.

Returns: void

voiceSDK.disconnect();

startRecording()

Start capturing and streaming audio.

Returns: Promise<boolean>

await voiceSDK.startRecording();

stopRecording()

Stop capturing audio.

Returns: Promise<boolean>

await voiceSDK.stopRecording();

toggleRecording()

Toggle recording state (start/stop).

Returns: Promise<boolean>

await voiceSDK.toggleRecording();

getStatus()

Get current connection and recording status.

Returns: Object

const status = voiceSDK.getStatus();
// Returns: {
//   version: '2.0.0',
//   isConnected: boolean,
//   isRecording: boolean,
//   isPlaying: boolean,
//   outputFormat: object,      // Negotiated output format (v2)
//   audioPlayer: object,       // AudioPlayer status
//   audioRecorder: object      // AudioRecorder status
// }
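
The status object is handy for guarding calls, for example:

// Only reconnect or start recording when needed
const { isConnected, isRecording } = voiceSDK.getStatus();

if (!isConnected) {
  await voiceSDK.reconnect();      // see reconnect() below
}
if (!isRecording) {
  await voiceSDK.startRecording();
}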

validateInputFormat(format)

v2 only: Validate input audio format configuration.

Parameters:

  • format (object) - Format object with encoding, sampleRate, bitDepth, channels

Returns: string|null - Error message if invalid, null if valid

const error = voiceSDK.validateInputFormat({
  encoding: 'pcm',
  sampleRate: 16000,
  bitDepth: 16,
  channels: 1
});
if (error) {
  console.error('Invalid format:', error);
}

validateOutputFormat(format)

v2 only: Validate output audio format configuration.

Parameters:

  • format (object) - Format object with container, encoding, sampleRate, bitDepth, channels

Returns: string|null - Error message if invalid, null if valid

const error = voiceSDK.validateOutputFormat({
  container: 'raw',
  encoding: 'pcm',
  sampleRate: 44100,
  bitDepth: 16,
  channels: 1
});
if (error) {
  console.error('Invalid format:', error);
}

updateConfig(newConfig)

Update SDK configuration dynamically.

Parameters:

  • newConfig (object) - Partial configuration object to merge with existing config

Returns: void

voiceSDK.updateConfig({
  outputSampleRate: 48000,
  outputEncoding: 'pcmu'
});

reconnect()

Manually reconnect to the agent.

Returns: Promise<boolean>

await voiceSDK.reconnect();

stopAudioPlayback()

Immediately stop audio playback (for barge-in).

Returns: void

voiceSDK.stopAudioPlayback();

on(event, callback)

Register an event listener.

Parameters:

  • event (string) - Event name
  • callback (function) - Event handler

voiceSDK.on('connected', () => {
  console.log('Connected!');
});

destroy()

Cleanup all resources and disconnect.

Returns: void

voiceSDK.destroy();

Events Reference

Event Parameters Description
connected - Emitted when successfully connected
disconnected event Emitted when disconnected (includes reason)
error error Emitted on errors
recordingStarted - Emitted when recording starts
recordingStopped - Emitted when recording stops
message message Emitted for all WebSocket messages
playbackStarted - Emitted when audio playback starts
playbackStopped - Emitted when audio playback stops
playbackError error Emitted on audio playback errors
bargeIn message Emitted when user interrupts agent
stopPlaying message Emitted when server requests to stop audio
formatNegotiated format v2 only: Emitted when audio format is negotiated with server. Format object contains: container, encoding, sampleRate, channels, bitDepth
greetingStarted - Emitted when greeting audio starts
domainError error Emitted when domain is not whitelisted

Configuration Options

Agent Settings Override

Complete reference for all overridable settings:

Core Settings

Setting Type Range/Values Description
prompt string Any text System prompt/instructions for the agent
temperature number 0.0 - 2.0 LLM creativity level
maxTokens number 1 - 4096 Maximum tokens per response
model string Model names ⚠️ NOT SUPPORTED - LLM model selection requires infrastructure changes
language string ISO codes Response language (e.g., 'en', 'es', 'fr')

Voice Settings

Setting Type Range/Values Description
voiceId string Voice IDs Specific voice identifier
voiceSpeed number 0.5 - 2.0 Voice speed multiplier

Behavior Settings

Setting | Type | Range/Values | Description
firstMessage | string | Any text | Initial greeting message
disableInterruptions | boolean | true/false | Prevent user from interrupting agent
autoDetectLanguage | boolean | true/false | Automatically detect user's language
candidateLanguages | array | Language codes | List of candidate languages for auto-detection (e.g., ['en', 'es', 'fr'])
maxCallDuration | number | Seconds | Maximum session duration

Advanced Settings

Setting | Type | Range/Values | Description
toolIds | array | Array of numbers | Custom tool IDs to enable for this agent (e.g., [123, 456, 789])
internalToolIds | array | Array of strings | Internal tool IDs to enable for this agent (e.g., ['calendar', 'weather', 'email'])
timezone | string | TZ names | User timezone (e.g., 'America/New_York')
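
As a rough sketch, these settings are passed as an override object when you create the SDK instance. The top-level agentSettings key below is illustrative, not confirmed by this reference (check the VoiceSDK constructor options for the exact field name); the individual setting names match the tables above.

const voiceSDK = new VoiceSDK({
  signedUrl: signedUrl,
  appId: 'your_app_id',
  agentId: 'agent_123',

  // Hypothetical override object - the key name is illustrative
  agentSettings: {
    prompt: 'You are a friendly support agent for ACME Inc.',
    temperature: 0.7,
    language: 'en',
    voiceSpeed: 1.1,
    firstMessage: 'Hi! How can I help you today?',
    disableInterruptions: false,
    maxCallDuration: 600,            // seconds
    timezone: 'America/New_York'
  }
});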

Text-to-Speech API

PUBLIC REST API

Generate high-quality voice audio from text using our public REST API endpoint.

🔒 Authentication: This endpoint requires API key authentication. Never expose your API key in frontend code - always call from your backend.

Endpoint

POST https://backend.talktopc.com/api/public/agents/tts/generate

Authentication

Include your API key in the Authorization header:

Authorization: Bearer YOUR_API_KEY

Request Parameters

Parameter | Type | Required | Description
text | string | Yes | The text to convert to speech
voiceId | string | No | Voice identifier (default: agent's configured voice)
voiceSpeed | number | No | Voice speed multiplier: 0.5 - 2.0 (default: 1.0)
language | string | No | Language code (e.g., 'en', 'es', 'fr')
agentId | string | No | Agent ID to use voice settings from

Example Requests

cURL:

curl -X POST https://backend.talktopc.com/api/public/agents/tts/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "text": "Hello! Welcome to our service.",
    "voiceId": "nova",
    "voiceSpeed": 1.2,
    "language": "en"
  }' \
  --output speech.mp3

JavaScript (Node.js):

const response = await fetch('https://backend.talktopc.com/api/public/agents/tts/generate', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${process.env.TTP_API_KEY}`
  },
  body: JSON.stringify({
    text: 'Hello! Welcome to our service.',
    voiceId: 'nova',
    voiceSpeed: 1.2,
    language: 'en'
  })
});

// The response body is the audio file itself (MP3)
const audioBuffer = await response.arrayBuffer();

// On the server (Node.js), save it to disk
const { writeFile } = await import('node:fs/promises');
await writeFile('speech.mp3', Buffer.from(audioBuffer));

// In the browser (with the request proxied through your backend), you could play it instead:
// const audioBlob = new Blob([audioBuffer], { type: 'audio/mpeg' });
// const audio = new Audio(URL.createObjectURL(audioBlob));
// audio.play();

Python:

import requests
import os

url = "https://backend.talktopc.com/api/public/agents/tts/generate"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['TTP_API_KEY']}"
}
data = {
    "text": "Hello! Welcome to our service.",
    "voiceId": "nova",
    "voiceSpeed": 1.2,
    "language": "en"
}

response = requests.post(url, json=data, headers=headers)

if response.status_code == 200:
    # Save audio to file
    with open("speech.mp3", "wb") as f:
        f.write(response.content)
    print("Audio saved to speech.mp3")
else:
    print(f"Error: {response.status_code}")

PHP:

<?php
$url = "https://backend.talktopc.com/api/public/agents/tts/generate";
$apiKey = getenv('TTP_API_KEY');

$data = [
    'text' => 'Hello! Welcome to our service.',
    'voiceId' => 'nova',
    'voiceSpeed' => 1.2,
    'language' => 'en'
];

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    'Content-Type: application/json',
    'Authorization: Bearer ' . $apiKey
]);

$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

if ($httpCode === 200) {
    file_put_contents('speech.mp3', $response);
    echo "Audio saved to speech.mp3";
} else {
    echo "Error: HTTP $httpCode";
}
?>

Java:

import java.net.http.*;
import java.net.URI;
import java.nio.file.*;

public class TTSExample {
    public static void main(String[] args) throws Exception {
        String url = "https://backend.talktopc.com/api/public/agents/tts/generate";
        String apiKey = System.getenv("TTP_API_KEY");
        
        String json = """
            {
                "text": "Hello! Welcome to our service.",
                "voiceId": "nova",
                "voiceSpeed": 1.2,
                "language": "en"
            }
            """;
        
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Content-Type", "application/json")
            .header("Authorization", "Bearer " + apiKey)
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();
        
        HttpResponse<byte[]> response = client.send(
            request, 
            HttpResponse.BodyHandlers.ofByteArray()
        );
        
        if (response.statusCode() == 200) {
            Files.write(Paths.get("speech.mp3"), response.body());
            System.out.println("Audio saved to speech.mp3");
        } else {
            System.out.println("Error: " + response.statusCode());
        }
    }
}

Response

The endpoint returns audio data directly with the following headers:

Header | Value | Description
Content-Type | audio/mpeg | Audio format (MP3)
Content-Length | number | Size of audio file in bytes
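
If you need these headers (for example, to verify the payload before saving it), read them from the fetch response shown in the JavaScript example above:

// `response` is the fetch() response from the JavaScript example above
console.log(response.headers.get('Content-Type'));    // "audio/mpeg"
console.log(response.headers.get('Content-Length'));  // size of the audio file in bytes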

Voice Speed Examples

Speed | Effect | Use Case
0.5 | 50% slower (half speed) | Educational content, accessibility
0.75 | 25% slower | Clear pronunciation, language learning
1.0 | Normal speed (default) | Standard conversation
1.2 | 20% faster | Quick updates, notifications
1.5 | 50% faster | Rapid information delivery
2.0 | 2x speed (double speed) | Maximum speed, time-saving

Backend Implementation Example

// Your backend endpoint
app.post('/api/generate-speech', async (req, res) => {
  const { text, voiceSpeed = 1.0 } = req.body;
  
  // Call TTP TTS API
  const ttpResponse = await fetch('https://backend.talktopc.com/api/public/agents/tts/generate', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.TTP_API_KEY}`  // 🔒 Secret!
    },
    body: JSON.stringify({
      text: text,
      voiceSpeed: voiceSpeed,
      voiceId: 'nova',
      language: 'en'
    })
  });
  
  if (!ttpResponse.ok) {
    return res.status(ttpResponse.status).json({ 
      error: 'TTS generation failed' 
    });
  }
  
  // Forward audio to client
  const audioBuffer = await ttpResponse.arrayBuffer();
  res.set('Content-Type', 'audio/mpeg');
  res.send(Buffer.from(audioBuffer));
});

Error Responses

Status Code | Description
400 | Bad Request - Invalid parameters
401 | Unauthorized - Invalid or missing API key
429 | Too Many Requests - Rate limit exceeded
500 | Internal Server Error - TTS generation failed

⚠️ Security Best Practices:
  • Never expose your API key in frontend JavaScript
  • Always call this endpoint from your backend server
  • Implement rate limiting on your backend (see the sketch after this list)
  • Validate and sanitize text input to prevent abuse
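
A minimal sketch of the last two points, assuming an Express backend and the express-rate-limit package (limits and length caps are illustrative):

const rateLimit = require('express-rate-limit');

// Throttle the TTS proxy endpoint (values are illustrative)
app.use('/api/generate-speech', rateLimit({ windowMs: 60 * 1000, max: 30 }));

app.post('/api/generate-speech', async (req, res) => {
  const { text } = req.body;

  // Validate the text before forwarding it to the TTS API
  if (typeof text !== 'string' || text.trim().length === 0) {
    return res.status(400).json({ error: 'text is required' });
  }
  if (text.length > 1000) {
    return res.status(400).json({ error: 'text is too long' });
  }

  // ...then call the TTP TTS API as shown in the backend example above
});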

Use Cases

📢 Announcements

Generate audio announcements for notifications

📚 Content Creation

Convert articles or books to audio format

♿ Accessibility

Provide audio alternatives for text content

🎓 E-Learning

Create voice-overs for educational materials

Java SDK

Server-side Java SDK for text-to-speech conversion. Perfect for backend applications, phone systems, and server-to-server integrations.

🎯 Use Cases:
  • Backend TTS: Generate speech on your server without exposing API keys
  • Phone Systems: Integrate with Twilio, Telnyx, or custom VoIP systems
  • Server-to-Server: Automated voice generation for notifications, alerts, or content
  • Audio Format Control: Request specific formats (PCMU, PCMA, PCM) for phone systems

Installation

Maven

<repositories>
    <repository>
        <id>github</id>
        <url>https://maven.pkg.github.com/TTP-GO/java-sdk</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>com.talktopc</groupId>
        <artifactId>ttp-agent-sdk-java</artifactId>
        <version>1.0.5</version>
    </dependency>
</dependencies>
⚠️ GitHub Packages Authentication:

You'll need to authenticate with GitHub Packages. Add credentials to your ~/.m2/settings.xml:

<settings>
    <servers>
        <server>
            <id>github</id>
            <username>YOUR_GITHUB_USERNAME</username>
            <password>YOUR_GITHUB_TOKEN</password>
        </server>
    </servers>
</settings>

Gradle

repositories {
    maven {
        url = uri("https://maven.pkg.github.com/TTP-GO/java-sdk")
        credentials {
            username = project.findProperty("gpr.user") ?: System.getenv("USERNAME")
            password = project.findProperty("gpr.key") ?: System.getenv("TOKEN")
        }
    }
}

dependencies {
    implementation 'com.talktopc:ttp-agent-sdk-java:1.0.5'
}

Quick Start

1. Initialize SDK

import com.talktopc.sdk.VoiceSDK;

// Get API key from environment variable
String apiKey = System.getenv("TALKTOPC_API_KEY");

VoiceSDK sdk = VoiceSDK.builder()
    .apiKey(apiKey)
    .baseUrl("https://api.talktopc.com")  // Optional
    .build();

2. Simple TTS (Blocking)

// Generate complete audio file
byte[] audio = sdk.textToSpeech("Hello world", "mamre");

// Save to file
Files.write(Paths.get("output.wav"), audio);

// Or send to phone system
phoneSystem.playAudio(audio);

3. Streaming TTS (Real-time)

// Stream audio chunks as they're generated
sdk.textToSpeechStream(
    "Hello world, this is a longer text that will be streamed",
    "mamre",
    audioChunk -> {
        // Receive chunks in real-time
        phoneSystem.playAudio(audioChunk);
    }
);

API Reference

VoiceSDK

Main SDK entry point for text-to-speech operations.

Builder Methods

Method | Type | Description
apiKey(String) | String | Your TalkToPC API key (required)
baseUrl(String) | String | API base URL (default: https://api.talktopc.com)
connectTimeout(int) | int | Connection timeout in milliseconds (default: 30000)
readTimeout(int) | int | Read timeout in milliseconds (default: 60000)
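
For example, a builder that overrides both timeouts (the values here are illustrative):

VoiceSDK sdk = VoiceSDK.builder()
    .apiKey(System.getenv("TALKTOPC_API_KEY"))
    .connectTimeout(10000)     // fail fast if the API is unreachable (10 s)
    .readTimeout(120000)       // allow long synthesis requests to complete (120 s)
    .build();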

Methods

Method | Description
textToSpeech(String text, String voiceId) | Simple TTS (blocking) - returns complete audio as byte array
textToSpeech(String text, String voiceId, double speed) | TTS with speed control (0.1 - 3.0)
textToSpeech(TTSRequest request) | TTS with full configuration (format, speed, etc.)
synthesize(TTSRequest request) | Get full response with metadata (sample rate, duration, credits)
textToSpeechStream(String text, String voiceId, Consumer<byte[]> chunkHandler) | Streaming TTS - chunks delivered to handler as they're generated
textToSpeechStream(TTSRequest request, Consumer<byte[]> chunkHandler, Consumer<StreamMetadata> onComplete, Consumer<Throwable> onError) | Streaming TTS with completion and error callbacks

TTSRequest Builder

Configure TTS requests with audio format options.

Basic Configuration

TTSRequest request = TTSRequest.builder()
    .text("Hello world")              // Required
    .voiceId("mamre")                // Required
    .speed(1.0)                      // Optional (0.1 - 3.0)
    .build();

Audio Format Configuration

TTSRequest request = TTSRequest.builder()
    .text("Hello world")
    .voiceId("mamre")
    .outputContainer("raw")          // "raw" or "wav"
    .outputEncoding("pcm")           // "pcm", "pcmu", "pcma"
    .outputSampleRate(16000)         // Hz (8000, 16000, 22050, 44100)
    .outputBitDepth(16)              // bits (8, 16, 24)
    .outputChannels(1)               // 1 (mono) or 2 (stereo)
    .outputFrameDurationMs(600)     // ms per frame (for streaming)
    .build();

Preset Methods

Method | Format | Use Case
phoneSystem() | PCMU @ 8kHz, 20ms frames | Phone systems (Twilio, Telnyx, VoIP)
highQuality() | WAV @ 44.1kHz | High-quality audio files
standardQuality() | PCM @ 22.05kHz | Standard quality audio
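
The presets can be used in place of the individual output* builder calls; for example, standardQuality(), which is not shown in the longer examples below:

TTSRequest request = TTSRequest.builder()
    .text("Standard quality example")
    .voiceId("mamre")
    .standardQuality()            // PCM @ 22.05kHz (see the table above)
    .build();

byte[] audio = sdk.textToSpeech(request);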

TTSResponse

Response object containing audio and metadata.

Method | Return Type | Description
getAudio() | byte[] | Audio data
getSampleRate() | int | Sample rate in Hz
getDurationMs() | long | Playback duration in milliseconds
getAudioSizeBytes() | long | Audio size in bytes
getCreditsUsed() | double | Credits consumed
getConversationId() | String | Unique conversation ID

Examples

Basic Usage

VoiceSDK sdk = VoiceSDK.builder()
    .apiKey(System.getenv("TALKTOPC_API_KEY"))
    .build();

// Simple TTS
byte[] audio = sdk.textToSpeech("Welcome to TalkToPC", "mamre");
System.out.println("Generated " + audio.length + " bytes of audio");

// Save to file
Files.write(Paths.get("output.wav"), audio);

With Speed Control

// Faster speech (1.5x speed)
byte[] fastAudio = sdk.textToSpeech("Quick message", "mamre", 1.5);

// Slower speech (0.8x speed)
byte[] slowAudio = sdk.textToSpeech("Slow and clear", "mamre", 0.8);

Streaming with Metadata

sdk.textToSpeechStream(
    TTSRequest.builder()
        .text("Streaming example with full configuration")
        .voiceId("mamre")
        .speed(1.0)
        .build(),
    audioChunk -> {
        // Handle each audio chunk
        System.out.println("Received chunk: " + audioChunk.length + " bytes");
        phoneSystem.playAudio(audioChunk);
    },
    metadata -> {
        // Handle completion
        System.out.println("Stream completed:");
        System.out.println("  Total chunks: " + metadata.getTotalChunks());
        System.out.println("  Total bytes: " + metadata.getTotalBytes());
        System.out.println("  Duration: " + metadata.getDurationMs() + " ms");
        System.out.println("  Credits: " + metadata.getCreditsUsed());
    },
    error -> {
        // Handle errors
        System.err.println("Stream error: " + error.getMessage());
    }
);

Phone System Integration

Perfect for Twilio, Telnyx, or custom VoIP systems.

Standard Phone System (PCMU @ 8kHz)

// Using convenient phoneSystem() preset
TTSRequest request = TTSRequest.builder()
    .text("Hello, thank you for calling. How can I help you today?")
    .voiceId("en-US-female")
    .phoneSystem()  // ✅ PCMU @ 8kHz, 20ms frames
    .build();

sdk.textToSpeechStream(
    request,
    audioChunk -> {
        // audioChunk is PCMU @ 8kHz, 20ms frames (160 bytes)
        // Ready to send directly to phone connection
        phoneConnection.sendAudio(audioChunk);
    }
);

Twilio Integration

TTSRequest request = TTSRequest.builder()
    .text("Your appointment is confirmed for tomorrow at 3 PM")
    .voiceId("en-US-male")
    .outputContainer("raw")
    .outputEncoding("pcmu")      // μ-law for Twilio
    .outputSampleRate(8000)       // 8kHz
    .outputBitDepth(16)
    .outputChannels(1)            // Mono
    .outputFrameDurationMs(20)    // 20ms frames
    .build();

sdk.textToSpeechStream(
    request,
    audioChunk -> {
        // Send to Twilio Media Stream
        twilioStream.sendMedia(audioChunk);
    }
);

Custom Audio Format

TTSRequest request = TTSRequest.builder()
    .text("Custom format example")
    .voiceId("mamre")
    .outputContainer("raw")
    .outputEncoding("pcm")
    .outputSampleRate(16000)      // 16kHz
    .outputBitDepth(16)           // 16-bit
    .outputChannels(1)            // Mono
    .outputFrameDurationMs(100)   // 100ms frames
    .build();

byte[] audio = sdk.textToSpeech(request);
// Expected: 16kHz PCM, 16-bit, mono

High Quality Audio

TTSRequest request = TTSRequest.builder()
    .text("This is a high quality recording")
    .voiceId("mamre")
    .highQuality()  // WAV @ 44.1kHz
    .build();

byte[] audio = sdk.textToSpeech(request);
Files.write(Paths.get("high_quality.wav"), audio);

Error Handling

import com.talktopc.sdk.exception.TtsException;

try {
    byte[] audio = sdk.textToSpeech("Test", "mamre");
} catch (TtsException e) {
    System.err.println("TTS Error [" + e.getStatusCode() + "]: " + e.getErrorMessage());
    
    switch (e.getStatusCode()) {
        case 401:
            System.err.println("→ Invalid API key");
            break;
        case 402:
            System.err.println("→ Insufficient credits");
            break;
        case 400:
            System.err.println("→ Invalid parameters");
            break;
        default:
            System.err.println("→ Other error");
    }
}

Full Configuration Example

import com.talktopc.sdk.models.TTSRequest;
import com.talktopc.sdk.models.TTSResponse;

// Build request with all options
TTSRequest request = TTSRequest.builder()
    .text("Full configuration example")
    .voiceId("mamre")
    .speed(1.2)
    .outputContainer("wav")
    .outputEncoding("pcm")
    .outputSampleRate(44100)
    .outputBitDepth(16)
    .outputChannels(1)
    .build();

// Get response with metadata
TTSResponse response = sdk.synthesize(request);

System.out.println("Audio: " + response.getAudioSizeBytes() + " bytes");
System.out.println("Sample rate: " + response.getSampleRate() + " Hz");
System.out.println("Duration: " + response.getDurationMs() + " ms");
System.out.println("Credits: " + response.getCreditsUsed());

// Save audio
Files.write(Paths.get("output.wav"), response.getAudio());

Supported Audio Formats

Format | Encoding | Sample Rates | Use Case
PCM | pcm | 8000, 16000, 22050, 24000, 44100 Hz | General purpose, high quality
PCMU (μ-law) | pcmu | 8000 Hz | Phone systems (Twilio, Telnyx, VoIP)
PCMA (A-law) | pcma | 8000 Hz | Phone systems (European standards)
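
PCMA is requested the same way as PCMU; a sketch for a European telephony system (text and frame settings are illustrative):

TTSRequest request = TTSRequest.builder()
    .text("Thank you for calling.")
    .voiceId("mamre")
    .outputContainer("raw")
    .outputEncoding("pcma")       // A-law for European phone systems
    .outputSampleRate(8000)       // 8kHz
    .outputChannels(1)            // Mono
    .outputFrameDurationMs(20)    // 20ms frames
    .build();

sdk.textToSpeechStream(request, audioChunk -> phoneConnection.sendAudio(audioChunk));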

Requirements

  • Java 11 or higher
  • Valid TalkToPC API key
  • No external dependencies - Uses Java 11+ HttpClient
💡 Key Differences from Frontend SDK:
  • Backend-only: Designed for server-side use, not browser
  • Format Pass-through: Can request PCMU/PCMA and forward directly to phone systems
  • No Audio Playback: Returns raw audio bytes - you handle playback/forwarding
  • REST API: Uses REST endpoints instead of WebSocket

Resources

  • GitHub Repository: TTP-GO/java-sdk
  • Maven Package: com.talktopc:ttp-agent-sdk-java:1.0.5
  • Documentation: See README.md in the repository
  • Examples: Check src/main/java/com/talktopc/sdk/examples/