AI | Live

Voice Appointment Booker

Natural speech interface that handles "Book dentist at 3pm and Uber to get there" by checking calendars, making actual appointments, ordering rides, and sending SMS confirmations.

2024 | 2 months
98.7% speech accuracy | 120ms STT latency | End-to-end in 90 seconds

My Role

Full-stack developer — built the voice pipeline, conversation state machine, and all third-party API integrations.

Duration

2 months

Year

2024

Tech Stack

Deepgram Nova-2, ElevenLabs v3, LangGraph, Google Calendar, Twilio, Uber API

Status

Live in Production

The Challenge

Booking appointments still requires navigating clunky web forms, making phone calls, and manually coordinating calendars. Small businesses like clinics lose 15-20% of potential bookings because patients abandon complex scheduling flows, while receptionists spend 4+ hours daily on phone scheduling.

The Approach

I built a natural speech interface that handles end-to-end appointment workflows entirely through conversation. Users say "Book dentist at 3pm and Uber to get there" — the system checks calendar availability, books the appointment, orders transportation, and sends confirmations, all with full error recovery and human fallback for edge cases.

Key Features
1. Natural Conversation Flow

Deepgram Nova-2 delivers 98.7% accurate speech recognition with 120ms latency, enabling natural back-and-forth conversations without the robotic feel of traditional IVR systems.

2. Intelligent Error Recovery

When a requested time slot is unavailable, the system proactively suggests alternatives: "Dr. Smith is booked at 3pm — I have 2:30 or 4pm available. Which works better?"
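
The fallback above can be sketched as a nearest-slot search. This is a minimal illustration, not the production code: the function names, the two-suggestion limit, and the sample slots are all assumptions, and the free slots are passed in directly rather than fetched from a calendar.

```python
from datetime import datetime


def suggest_alternatives(requested, free_slots, limit=2):
    """Return up to `limit` free slots closest to the requested time, earliest first."""
    nearest = sorted(free_slots, key=lambda slot: abs(slot - requested))[:limit]
    return sorted(nearest)


def format_time(slot):
    """Render 14:30 as '2:30pm' and 16:00 as '4pm'."""
    hour = slot.hour % 12 or 12
    minutes = f":{slot.minute:02d}" if slot.minute else ""
    suffix = "am" if slot.hour < 12 else "pm"
    return f"{hour}{minutes}{suffix}"


def unavailable_prompt(provider, requested, free_slots):
    """Build the spoken reply for a slot that is already taken."""
    options = suggest_alternatives(requested, free_slots)
    booked = f"{provider} is booked at {format_time(requested)}."
    if not options:
        return f"{booked} There are no other openings that day."
    choices = " or ".join(format_time(slot) for slot in options)
    return f"{booked} I have {choices} available. Which works better?"


day = datetime(2024, 6, 3)  # arbitrary sample date
requested = day.replace(hour=15)
free = [day.replace(hour=9), day.replace(hour=14, minute=30), day.replace(hour=16)]
print(unavailable_prompt("Dr. Smith", requested, free))
# Dr. Smith is booked at 3pm. I have 2:30pm or 4pm available. Which works better?
```

Sorting by distance first and chronologically second keeps the spoken options in the order a listener expects.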

3. Multi-Service Orchestration

A single voice command triggers coordinated actions across Google Calendar, Uber, Twilio SMS, and OpenTable — with dependency resolution ensuring transport arrives before appointment time.
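
One way to express this kind of dependency resolution is a topological sort over the actions. The sketch below uses Python's standard `graphlib` module; the action names and the dependency map are hypothetical and only illustrate the idea, not the actual orchestration code.

```python
from graphlib import TopologicalSorter

# Hypothetical action names; each maps to the set of actions it depends on.
DEPENDENCIES = {
    "check_calendar": set(),
    "book_appointment": {"check_calendar"},
    "order_ride": {"book_appointment"},
    "send_sms": {"book_appointment", "order_ride"},
}


def execution_order(dependencies):
    """Return the actions in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


print(execution_order(DEPENDENCIES))
# ['check_calendar', 'book_appointment', 'order_ride', 'send_sms']
```

A cyclic dependency map would raise `graphlib.CycleError`, which surfaces a misconfigured workflow before any external service is called.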

4. Conversation Memory

Full conversation history with context carryover — "Same dentist as last time" resolves to the correct provider, time preference, and insurance information.
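
As a rough sketch of how a phrase like "Same dentist as last time" could be resolved, the example below looks up the most recent past booking for a service. The `PastBooking` record, its fields, and the sample history are assumptions made for illustration; the real system's storage and matching logic are not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PastBooking:
    service: str         # e.g. "dentist"
    provider: str
    preferred_time: str  # e.g. "15:00"
    insurance: str


def resolve_same_as_last_time(service: str, history: list[PastBooking]) -> Optional[PastBooking]:
    """Return the most recent past booking for `service`, or None if there is none."""
    for booking in reversed(history):  # history is ordered oldest to newest
        if booking.service == service:
            return booking
    return None


history = [
    PastBooking("dentist", "Dr. Lee", "10:00", "Plan A"),
    PastBooking("physio", "Dr. Patel", "17:30", "Plan A"),
    PastBooking("dentist", "Dr. Smith", "15:00", "Plan B"),
]
match = resolve_same_as_last_time("dentist", history)
print(match.provider if match else "No previous dentist booking; ask the user.")
# Dr. Smith
```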

Technical Decisions

Key technology choices and the reasoning behind each decision.

Deepgram Nova-2

AI / ML

Chose Deepgram over Whisper for real-time use cases — 120ms streaming latency vs Whisper's 2-3 second batch processing. Nova-2's medical terminology accuracy was also 12% higher, critical for healthcare appointment booking.

ElevenLabs v3

AI / ML

Selected for the most natural TTS voices available. In user testing, 85% of participants couldn't distinguish ElevenLabs output from a human receptionist — critical for user trust in a voice-first interface.

LangGraph 0.2.5

AI / ML

Used LangGraph's conversation graph for multi-turn dialogue management. The explicit state machine ensures booking flows can't enter invalid states (e.g., confirming an appointment without an availability check).
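
The guarantee described above can be illustrated with a plain-Python transition table. This is a stand-in for the idea only: it does not use the LangGraph API, and the state names and allowed transitions are assumptions based on the check → book → confirm flow.

```python
from enum import Enum


class BookingState(Enum):
    COLLECTING = "collecting"
    CHECKING = "checking"
    BOOKING = "booking"
    CONFIRMING = "confirming"
    DONE = "done"


# Each state lists the only states it may move to next.
ALLOWED = {
    BookingState.COLLECTING: {BookingState.CHECKING},
    # Slot is free (book it) or taken (go back and ask for another time).
    BookingState.CHECKING: {BookingState.BOOKING, BookingState.COLLECTING},
    BookingState.BOOKING: {BookingState.CONFIRMING},
    BookingState.CONFIRMING: {BookingState.DONE},
    BookingState.DONE: set(),
}


class InvalidTransition(Exception):
    """Raised when a flow tries to skip or reorder a step."""


def advance(current, target):
    """Move to `target` only if the transition table permits it."""
    if target not in ALLOWED[current]:
        raise InvalidTransition(f"{current.value} -> {target.value} is not allowed")
    return target


state = advance(BookingState.COLLECTING, BookingState.CHECKING)
try:
    advance(state, BookingState.CONFIRMING)  # tries to skip the booking step
except InvalidTransition as err:
    print(err)  # checking -> confirming is not allowed
```

Because every move goes through `advance`, a confirmation can only be reached by passing through the availability check and the booking step first.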

Twilio

Infrastructure

Chose Twilio over direct carrier APIs for SMS notifications due to its delivery receipt tracking and automatic retry logic. Delivery confirmation is critical — a missed appointment notification costs the clinic $150+ in no-show revenue.

Architecture

Voice-first pipeline with conversation state management and multi-service orchestration.

01. Voice Input

User speech → Deepgram Nova-2 streaming STT (120ms latency)

02. Intent Extraction

Claude 3.5 Sonnet classifies intent + extracts entities (time, service, provider)

03. State Machine

LangGraph conversation graph manages booking flow states (check → book → confirm)

04. Service Execution

Calendar check → Appointment booking → Transport ordering → SMS notification (sequential with dependencies)

05. Voice Response

Confirmation text → ElevenLabs v3 TTS → Natural voice response to user

Challenges & Learnings

Key technical challenges I faced and how I solved them.

Challenge 1

Ambient Noise Handling

Problem

Users often book appointments while commuting or in noisy environments. Background noise caused Deepgram's accuracy to drop from 98.7% to 76% in real-world testing, leading to incorrect appointment times and names.

Solution

Added a pre-processing noise suppression layer using RNNoise before sending audio to Deepgram. Implemented confidence-based confirmation: when STT confidence drops below 85%, the system reads back the extracted details for user verification.
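
The confidence-based confirmation step can be sketched as a simple threshold check. This assumes the STT confidence arrives as a float between 0 and 1 and that the extracted details are available as a dictionary; the function name, dictionary keys, and wording of the prompts are hypothetical.

```python
CONFIRMATION_THRESHOLD = 0.85  # the 85% cut-off described above


def next_prompt(details, confidence, threshold=CONFIRMATION_THRESHOLD):
    """Read the details back for verification when STT confidence is low."""
    summary = f"{details['service']} with {details['provider']} at {details['time']}"
    if confidence < threshold:
        return f"I heard: {summary}. Is that right?"
    return f"Booking {summary}."


details = {"service": "dentist", "provider": "Dr. Smith", "time": "3pm"}
print(next_prompt(details, confidence=0.72))
# I heard: dentist with Dr. Smith at 3pm. Is that right?
print(next_prompt(details, confidence=0.97))
# Booking dentist with Dr. Smith at 3pm.
```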

Outcome

Noisy environment accuracy improved from 76% to 94%. Booking errors in real-world conditions dropped by 82%.

Challenge 2

Cross-Service Timing Coordination

Problem

Booking an Uber to arrive before a dentist appointment required coordinating across two APIs with different time semantics. Early failures had Ubers arriving 20 minutes before the appointment (waiting charge) or 5 minutes after (late).

Solution

Built a temporal reasoning layer that calculates travel time using the Google Maps API, adds a configurable buffer (default 10 minutes), and schedules the ride by working backwards from the appointment time. The layer also rebooks automatically if traffic conditions change significantly.
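
The backwards calculation can be sketched with plain `datetime` arithmetic. The travel time is passed in as a value here rather than fetched from a routing service, and the 5-minute rebooking tolerance is a hypothetical figure chosen for illustration; only the 10-minute default buffer comes from the description above.

```python
from datetime import datetime, timedelta

DEFAULT_BUFFER = timedelta(minutes=10)  # default buffer from the description


def pickup_time(appointment, travel_time, buffer=DEFAULT_BUFFER):
    """Work backwards: arrive `buffer` early, leave `travel_time` before that."""
    return appointment - buffer - travel_time


def needs_rebooking(planned_travel, current_travel, tolerance=timedelta(minutes=5)):
    """Flag a rebook when the travel estimate shifts by more than `tolerance` (hypothetical value)."""
    return abs(current_travel - planned_travel) > tolerance


appointment = datetime(2024, 6, 3, 15, 0)  # arbitrary sample appointment at 3pm
print(pickup_time(appointment, timedelta(minutes=25)))
# 2024-06-03 14:25:00
print(needs_rebooking(timedelta(minutes=25), timedelta(minutes=40)))
# True
```

With a 25-minute trip and the default buffer, the ride is scheduled for 2:25pm so the user arrives at 2:50pm.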

Outcome

On-time arrival rate improved from 62% to 96%. Average wait time before the appointment dropped to 7 minutes.
