Ultravox: Open Source Speech LM
An overview of Ultravox, an open‑source multimodal LLM that directly processes speech, covering architecture, API usage, and a live end‑to‑end demonstration.
Ultravox is a new multimodal LLM that understands speech directly: unlike current voice AI stacks, it does not require a separate speech recognition stage. This approach makes voice AI applications faster and more robust, and lets them understand the non-textual parts of speech.
Because it builds on a Llama 3 backbone, it can be trained much faster than a typical foundation model. We’ve just open-sourced Ultravox at https://ultravox.ai and are working on growing a community around it.
Ultravox: Direct speech SLM for scalable, low-latency voice AI.