Voice Chat Client Package

React hook for real-time voice conversations with ElevenLabs Conversational AI

Overview

@zooly/voice-chat-client is a pure React client-side package that provides the useVoiceConversation hook for managing real-time voice conversation sessions with ElevenLabs Conversational AI.

Package Details

  • Package Name: @zooly/voice-chat-client
  • Location: packages/voice-chat/client
  • Type: React hook library (client-side)

Key Features

  • Voice Session Management: Start and end ElevenLabs conversational AI sessions
  • Microphone Permission Handling: Permission probe pattern for browser mic access
  • Message Accumulation: Normalised transcript of user and assistant messages
  • Voice Call Log Persistence: Fire-and-forget backend log updates via injected API adapter
  • Audio Analysis: Real-time Web Audio API volume analysis for user speaking detection
  • Derived UI State: isAiSpeaking, isUserSpeaking, conversationMode, getStateLabel

Dependencies

  • @elevenlabs/react — ElevenLabs React SDK
  • @zooly/types — Shared types (VoiceCallMessage)
  • @zooly/util — Shared utilities
  • react (peer) — React 18 or 19

Hook API

Input — UseVoiceConversationProps

interface UseVoiceConversationProps {
  accountId: string | null;
  voiceCallLogApi: VoiceCallLogApi;
  onConnect?: () => void;
  onDisconnect?: () => void;
  onError?: (error: any) => void;
  onMessage?: (message: VoiceCallMessage) => void;
  onCallStarted?: (params: { agentId: string }) => void;
  onCallEnded?: (params: { agentId: string }) => void;
}

Prop             Purpose
accountId        Current account ID, passed to voice call log creation
voiceCallLogApi  Injected API adapter for creating and updating voice call logs
onConnect        Called when the ElevenLabs session connects
onDisconnect     Called when the session disconnects
onError          Called on session errors
onMessage        Called with each normalised message (user or assistant)
onCallStarted    Called after session starts; wire analytics here
onCallEnded      Called after session ends; wire analytics here

VoiceCallLogApi Adapter

interface VoiceCallLogApi {
  create: (params: { accountId: string | null; agentId: string }) => Promise<{ id: string }>;
  update: (params: { id: string; voiceCallMessages: VoiceCallMessage[] }) => Promise<void>;
}

The consuming app implements this interface to bridge to its own API routes. This avoids hardcoding fetch URLs inside the package.
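
As an illustration of the adapter described above, here is a minimal sketch of how a consuming app might implement VoiceCallLogApi on top of fetch. The factory name createVoiceCallLogApi, the /api/voice-call-logs route layout, the POST/PATCH verbs, and the VoiceCallMessage shape are all assumptions made for this example; they are not defined by the package.

```typescript
// Local stand-in for the type from @zooly/types; the real shape may differ.
interface VoiceCallMessage {
  role: "user" | "assistant";
  text: string;
}

// Mirrors the VoiceCallLogApi interface documented above.
interface VoiceCallLogApi {
  create: (params: { accountId: string | null; agentId: string }) => Promise<{ id: string }>;
  update: (params: { id: string; voiceCallMessages: VoiceCallMessage[] }) => Promise<void>;
}

// Minimal subset of the fetch API that this adapter relies on.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json: () => Promise<unknown> }>;

// Hypothetical factory: the base URL and route layout are assumptions.
function createVoiceCallLogApi(
  fetchImpl: FetchLike,
  baseUrl: string = "/api/voice-call-logs",
): VoiceCallLogApi {
  // Send a JSON request and reject on non-2xx responses.
  const send = async (url: string, method: string, payload: unknown) => {
    const res = await fetchImpl(url, {
      method,
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    });
    if (!res.ok) {
      throw new Error(`Voice call log request failed with status ${res.status}`);
    }
    return res;
  };

  return {
    create: async ({ accountId, agentId }) => {
      const res = await send(baseUrl, "POST", { accountId, agentId });
      const data = (await res.json()) as { id: string };
      return { id: data.id };
    },
    update: async ({ id, voiceCallMessages }) => {
      await send(`${baseUrl}/${encodeURIComponent(id)}`, "PATCH", { voiceCallMessages });
    },
  };
}
```

In a browser app the global fetch can be passed as fetchImpl, and the resulting object handed to the hook as voiceCallLogApi. Injecting fetch keeps the adapter easy to exercise with a fake in tests.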

Output — UseVoiceConversationReturn

Field                        Type                                Description
agentId                      string                              ElevenLabs agent ID for the current session
isCallOnGoing                boolean                             Whether a voice session is active
isStartVoiceCall             boolean                             UI toggle for starting a voice call
setIsStartVoiceCall          (v: boolean) => void                Setter for the toggle
voiceCallMessages            VoiceCallMessage[]                  Accumulated transcript
setVoiceCallMessages         (msgs) => void                      Replace the transcript
addToVoiceCallMessages       (msg) => void                       Append a message (e.g. synthetic errors)
isInitializingVoiceCall      boolean                             True while session is being established
isPermissionGranted          boolean                             Whether mic permission has been granted
startConversation            (agentId: string) => Promise<void>  Start a session
endConversation              () => Promise<void>                 End the current session
requestMicrophonePermission  () => Promise<void>                 Request mic access
currentVoiceCallLogId        string | null                       DB ID of the current voice call log
getConversationId            () => string | undefined            ElevenLabs conversation ID
isAiSpeaking                 boolean                             True when the AI agent is speaking
isUserSpeaking               boolean                             True when the AI is not speaking
conversationMode             "speaking" | "listening"            From the AI's perspective
conversation                 SDK return                          Raw useConversation return value
getStateLabel                () => string                        Human-readable status label
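
The derived fields at the end of the table can be read as a pure function of a single "AI is speaking" flag. The sketch below follows the descriptions in the table literally. The function names deriveVoiceUiState and describeState, the assumption that the flag comes from the SDK's useConversation return value, and the label strings are all hypothetical; the hook's actual labels and derivation may differ.

```typescript
type ConversationMode = "speaking" | "listening";

interface DerivedVoiceUiState {
  isAiSpeaking: boolean;
  isUserSpeaking: boolean;
  conversationMode: ConversationMode;
}

// sdkIsSpeaking is assumed to be the speaking flag exposed by the SDK.
function deriveVoiceUiState(sdkIsSpeaking: boolean): DerivedVoiceUiState {
  const isAiSpeaking = sdkIsSpeaking;
  return {
    isAiSpeaking,
    // Per the table: true whenever the AI is not speaking.
    isUserSpeaking: !isAiSpeaking,
    // Expressed from the AI's perspective.
    conversationMode: isAiSpeaking ? "speaking" : "listening",
  };
}

interface StateLabelInput {
  isInitializingVoiceCall: boolean;
  isCallOnGoing: boolean;
  isAiSpeaking: boolean;
}

// Hypothetical label strings; the real getStateLabel output is not documented here.
function describeState(input: StateLabelInput): string {
  if (input.isInitializingVoiceCall) return "Connecting";
  if (!input.isCallOnGoing) return "Idle";
  return input.isAiSpeaking ? "AI speaking" : "Listening";
}
```

Keeping this logic in pure functions makes it straightforward to unit test separately from React rendering.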

Conversation Lifecycle

  1. Consumer calls startConversation(agentId) with the ElevenLabs agent ID
  2. Hook requests mic permission if not already granted
  3. ElevenLabs session opens via conversation.startSession()
  4. A voice call log is created via voiceCallLogApi.create()
  5. Messages arrive via the SDK's onMessage callback and are normalised into VoiceCallMessage objects
  6. Each message update fires voiceCallLogApi.update() (fire-and-forget, best-effort)
  7. Consumer calls endConversation() to tear down the session
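
The steps above can be sketched as a small framework-free controller with injected dependencies. This is not the hook's implementation: createVoiceCallLifecycle, the LifecycleDeps shape, and the VoiceCallMessage fields are hypothetical names chosen for illustration, and the real hook manages this state through React.

```typescript
// Local stand-in for the type from @zooly/types; the real shape may differ.
interface VoiceCallMessage {
  role: "user" | "assistant";
  text: string;
}

// Hypothetical dependency bundle standing in for the browser, the SDK and the log adapter.
interface LifecycleDeps {
  accountId: string | null;
  requestMicrophonePermission: () => Promise<void>;
  startSession: (agentId: string) => Promise<void>;
  endSession: () => Promise<void>;
  logApi: {
    create: (params: { accountId: string | null; agentId: string }) => Promise<{ id: string }>;
    update: (params: { id: string; voiceCallMessages: VoiceCallMessage[] }) => Promise<void>;
  };
}

function createVoiceCallLifecycle(deps: LifecycleDeps) {
  let permissionGranted = false;
  let logId: string | null = null;
  let messages: VoiceCallMessage[] = [];

  return {
    // Steps 1-4: permission, session start, log creation.
    async start(agentId: string): Promise<void> {
      if (!permissionGranted) {
        await deps.requestMicrophonePermission();
        permissionGranted = true;
      }
      await deps.startSession(agentId);
      const { id } = await deps.logApi.create({ accountId: deps.accountId, agentId });
      logId = id;
    },

    // Steps 5-6: accumulate the message, then update the log without awaiting.
    handleMessage(message: VoiceCallMessage): void {
      messages = [...messages, message];
      if (logId !== null) {
        // Fire-and-forget: failures are swallowed so the call is never interrupted.
        void deps.logApi.update({ id: logId, voiceCallMessages: messages }).catch(() => {});
      }
    },

    // Step 7: tear down the session.
    async end(): Promise<void> {
      await deps.endSession();
      logId = null;
    },

    getMessages: (): VoiceCallMessage[] => messages,
  };
}
```

Because the log update is neither awaited nor allowed to throw, a failing backend degrades persistence only; the transcript held in memory is unaffected.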

Audio Analysis

When the session is connected and isCallOnGoing is true, the hook sets up a Web Audio API analyser on the microphone input. The analyser computes the average volume from frequency data using fftSize = 256 (which yields 128 frequency bins) and smoothingTimeConstant = 0.8. The analysis loop runs via requestAnimationFrame and is cleaned up when the session ends.
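
To make the volume computation concrete, here is a minimal sketch of averaging analyser frequency data. It uses a small AnalyserLike interface (an assumption for this example) that covers the members of the Web Audio AnalyserNode it needs, so the functions can be run without a browser. The function names and the speaking threshold of 20 are hypothetical; the hook's own threshold is not documented here.

```typescript
// Subset of the Web Audio AnalyserNode used by this sketch.
interface AnalyserLike {
  fftSize: number;
  smoothingTimeConstant: number;
  readonly frequencyBinCount: number;
  getByteFrequencyData(array: Uint8Array): void;
}

// Apply the settings named in the text above.
function configureAnalyser(analyser: AnalyserLike): void {
  analyser.fftSize = 256;
  analyser.smoothingTimeConstant = 0.8;
}

// Mean of the byte-valued frequency bins (each bin is in the range 0-255).
function averageVolume(data: Uint8Array): number {
  if (data.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < data.length; i++) {
    sum += data[i];
  }
  return sum / data.length;
}

// Read one frame of frequency data and reduce it to a single volume figure.
function readAverageVolume(analyser: AnalyserLike): number {
  const data = new Uint8Array(analyser.frequencyBinCount);
  analyser.getByteFrequencyData(data);
  return averageVolume(data);
}

// Hypothetical threshold; tune it against real microphone input.
function isAboveSpeakingThreshold(volume: number, threshold: number = 20): boolean {
  return volume > threshold;
}
```

In a browser, the analyser would come from audioContext.createAnalyser(), fed by audioContext.createMediaStreamSource(stream), with readAverageVolume called once per requestAnimationFrame tick and the frame request cancelled on cleanup.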

Usage Example

See App Integration for how to wire this hook into zooly-app.