Orca
This talk covers Orca, a Rust-based LLM orchestration tool enabling local inference with open-source models, using Candle and exploring WebAssembly for browser-based ML.
Orca is an in-development LLM orchestration framework written in Rust. It focuses on local LLM inference, both on your own machine and, in the future, fully in your browser using WebAssembly. No internet connection or API calls are required.
Rust LLM orchestrator framework with Qdrant vector store support and sequential pipelines.
Candle: Minimalist Rust ML framework enabling fast CUDA, WASM, and CPU inference.
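To make the idea of a sequential pipeline concrete, here is a minimal sketch in plain Rust, in which each step receives the previous step's output. This is not Orca's actual API: `SequentialPipeline`, `then`, `run`, and `stub_model` are hypothetical names chosen for illustration, and the model call is a stub rather than real Candle inference.

```rust
// Hypothetical sketch: none of these names come from Orca's actual API.

/// A single pipeline stage: takes the previous stage's text, returns new text.
type Step = Box<dyn Fn(&str) -> String>;

/// Runs its steps in order, feeding each output into the next step.
struct SequentialPipeline {
    steps: Vec<Step>,
}

impl SequentialPipeline {
    fn new() -> Self {
        Self { steps: Vec::new() }
    }

    /// Appends a step and returns the pipeline so calls can be chained.
    fn then(mut self, step: impl Fn(&str) -> String + 'static) -> Self {
        self.steps.push(Box::new(step));
        self
    }

    /// Threads `input` through every step; with no steps, returns it unchanged.
    fn run(&self, input: &str) -> String {
        self.steps
            .iter()
            .fold(input.to_string(), |acc, step| step(acc.as_str()))
    }
}

/// Stand-in for a local model call (assumption: a real step would run
/// inference here instead of formatting a string).
fn stub_model(prompt: &str) -> String {
    format!("[model output for: {}]", prompt)
}

fn main() {
    let pipeline = SequentialPipeline::new()
        .then(|topic| format!("Summarize the topic: {}", topic)) // build a prompt
        .then(stub_model)                                        // "run" the model
        .then(|text| text.trim().to_string());                   // post-process

    println!("{}", pipeline.run("local inference"));
}
```

Boxing each step as a closure keeps the sketch small; a real framework would more likely define a trait for steps so they can carry state such as a loaded model or a vector store client.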