About Callgeist
The whys and hows of Callgeist
The why
Callgeist was born from a personal need. Recently, I started playing around with Twilio and OpenAI's Realtime API to build phone-based voice agents. My first idea was an agent for car dealerships, following the motto: show, don't tell.
I quickly found myself calling my own agent before every meeting because I was afraid it might be down. Around this time, I came up with the idea of building a version of UptimeRobot specifically for AI voice agents.
When I told some friends of mine, who work in the call center / voice agent space, about my idea, they quickly told me they were facing the same issue I had. They encouraged me to build a tool they could use as well.
And so Callgeist was born.
Fun Fact
This project was built in just one week, showcasing the power of AI-assisted development in modern software creation. (This was obviously written by an LLM.)
The how
To get this project shipped as fast as possible, I mostly used tools I'm already familiar with, plus the excellent Cursor IDE.
Setup
I did not use any AI tools for the setup. At first I tried using Cursor to set up a Next.js 15 + Shadcn + Supabase project, but that didn't work very well, so I did it by hand instead.
From previous experiments I've learned that Cursor works best when all Shadcn components are already installed, so I installed them up front.
UI
To create the UI I went all in on Cursor. Since I wanted the UI and UX to be similar to the actual UptimeRobot page (and Supabase), I simply took screenshots of their pages and gave them to Cursor. Cursor was able to generate the landing page plus all other pages. Of course, at times I had to tell it to make things more mobile friendly and so on. But all in all, I only needed to change about 1% of the UI code myself.
For the icons and video I used DALL-E and Sora. It worked pretty well. However, Sora still struggles a lot with remixes. For example, it was not able to remix the landing page video with a ghostly face, which I wanted to use for the 404 page.
Call Monitoring
Building the actual call monitoring was the most difficult part. Luckily, Supabase recently added cron jobs (via pg_cron) and WebSocket support to their edge functions, which allowed me to do everything within the Supabase platform.
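To make the cron part a bit more concrete, here is a minimal sketch of the kind of check a cron-triggered function could run on every tick: pick the monitors whose interval has elapsed. The Monitor shape, its field names and the dueMonitors helper are assumptions for illustration only, not the actual Callgeist schema or code.

```typescript
// Hypothetical shape of a monitor row; the real Callgeist schema is not shown in this post.
interface Monitor {
  id: string;
  phoneNumber: string;
  intervalMinutes: number;
  lastCheckedAt: Date | null;
}

// Return the monitors whose check interval has elapsed at `now`.
function dueMonitors(monitors: Monitor[], now: Date): Monitor[] {
  return monitors.filter((m) => {
    // A monitor that has never been checked is always due.
    if (m.lastCheckedAt === null) return true;
    const elapsedMs = now.getTime() - m.lastCheckedAt.getTime();
    return elapsedMs >= m.intervalMinutes * 60 * 1000;
  });
}
```

With something like this, the cron job itself can stay dumb (for example, fire once a minute) while each monitor keeps its own interval.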
It took me a while to understand how the Twilio API works (start a call, start a stream, listen on the WebSocket connection) and how the OpenAI Realtime API works. I think the hardest parts were getting function calling to work (sadly, it's still a bit flaky at times) and automatically ending a call as soon as the AI was confident enough about who had answered it.
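The "hang up once the AI is confident" idea can be sketched roughly like this. The model gets a function it can call to report who answered, and the monitor only ends the call when the reported confidence is high enough. The tool name report_answerer, its fields, the list of answerer types and the 0.8 threshold are all assumptions for illustration, not the actual Callgeist code. Function-call arguments arrive as a JSON string, so the sketch validates them defensively, which also helps with the flakiness mentioned above.

```typescript
// Hypothetical tool definition in the JSON-schema style used for function calling.
const reportAnswererTool = {
  type: "function",
  name: "report_answerer",
  description: "Report who answered the call and how confident you are.",
  parameters: {
    type: "object",
    properties: {
      answerer: { type: "string", enum: ["ai_agent", "human", "voicemail", "unknown"] },
      confidence: { type: "number", minimum: 0, maximum: 1 },
    },
    required: ["answerer", "confidence"],
  },
};

const ANSWERERS: string[] = ["ai_agent", "human", "voicemail", "unknown"];

interface Verdict {
  answerer: string;
  confidence: number;
}

// Parse the raw JSON arguments of a function call; return null if they are malformed.
function parseVerdict(rawArguments: string): Verdict | null {
  try {
    const parsed = JSON.parse(rawArguments);
    const valid =
      parsed !== null &&
      typeof parsed === "object" &&
      ANSWERERS.indexOf(parsed.answerer) !== -1 &&
      typeof parsed.confidence === "number" &&
      parsed.confidence >= 0 &&
      parsed.confidence <= 1;
    return valid ? { answerer: parsed.answerer, confidence: parsed.confidence } : null;
  } catch {
    return null; // not valid JSON
  }
}

// Decide whether the call can be ended early (threshold is an assumed default).
function shouldHangUp(verdict: Verdict | null, threshold: number = 0.8): boolean {
  return verdict !== null && verdict.answerer !== "unknown" && verdict.confidence >= threshold;
}
```

When shouldHangUp returns true, the monitor could end the call on the Twilio side, for example by updating the call's status to completed. Otherwise it keeps listening until a timeout.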
About Voice User Interfaces (VUI)
I'm personally very invested in Voice User Interfaces (VUI). At home, my son and I use ChatGPT's voice capabilities quite a lot. For example, we ask it to pretend to be a hotel receptionist and we play going on holiday.
What I love about VUIs is that interacting with them comes naturally to humans. I can see my parents and my kid interact with VUIs without needing any training. They simply talk to them like they would to any other person. I really believe VUIs will change the way people interact with computers in the near future.
Of course, I shouldn't hide that my PhD thesis was about how voice agents might be used to influence people on a larger scale. I wrote my thesis about 8 years ago, but only now are the things we thought about becoming reality. It's a bit scary but also pretty cool.