DoppelGoner: Vector Entity Clustering
Learn how to build a Rust-based vector clustering system that uses transformer embeddings, incremental processing, and graph algorithms to reconcile entities across federated databases.
I’ll demonstrate how I built DoppelGoner, an open-source Rust implementation that uses transformer embeddings and graph-based clustering to solve entity reconciliation across federated databases. The demo will include:
Live code walkthrough of the vector similarity pipeline using BGE-small embeddings
Technical deep-dive into the incremental processing architecture that enables efficient repeated runs
Demonstration of the graph-based cluster consolidation using petgraph for transitive relationship discovery
Performance optimization techniques for pgvector operations and parallel embedding generation
Live demo of semantic service matching, showing how the system identifies services that mean the same thing even when they are described with different terminology
I’ll run everything live on my MacBook, showing that sophisticated AI tooling can be deployed efficiently without massive compute resources.
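To make the similarity-then-consolidation idea from the agenda concrete, here is a minimal sketch, not DoppelGoner's actual code. It assumes embeddings are already computed (the vectors in the usage note are toy values, not BGE-small output). It links any pair whose cosine similarity meets a hypothetical threshold, then merges linked pairs transitively. A small hand-rolled union-find stands in for petgraph so the example has no dependencies, and the brute-force pairwise loop stands in for a pgvector query. The names `cosine_similarity`, `UnionFind`, and `cluster_entities` are illustrative.

```rust
use std::collections::BTreeMap;

/// Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Minimal disjoint-set structure (assumed stand-in for a graph library).
struct UnionFind {
    parent: Vec<usize>,
}

impl UnionFind {
    fn new(n: usize) -> Self {
        UnionFind { parent: (0..n).collect() }
    }

    /// Find the representative of `x`, compressing the path on the way.
    fn find(&mut self, x: usize) -> usize {
        let mut root = x;
        while self.parent[root] != root {
            root = self.parent[root];
        }
        let mut cur = x;
        while self.parent[cur] != root {
            let next = self.parent[cur];
            self.parent[cur] = root;
            cur = next;
        }
        root
    }

    /// Merge the sets containing `a` and `b`; the smaller root index wins.
    fn union(&mut self, a: usize, b: usize) {
        let (ra, rb) = (self.find(a), self.find(b));
        if ra != rb {
            let (lo, hi) = if ra < rb { (ra, rb) } else { (rb, ra) };
            self.parent[hi] = lo;
        }
    }
}

/// Group entity indices into clusters: any pair at or above `threshold`
/// is linked, and links are followed transitively.
fn cluster_entities(embeddings: &[Vec<f32>], threshold: f32) -> Vec<Vec<usize>> {
    let n = embeddings.len();
    let mut uf = UnionFind::new(n);
    for i in 0..n {
        for j in (i + 1)..n {
            if cosine_similarity(&embeddings[i], &embeddings[j]) >= threshold {
                uf.union(i, j);
            }
        }
    }
    // Collect members under their representative; BTreeMap keeps output ordered.
    let mut groups: BTreeMap<usize, Vec<usize>> = BTreeMap::new();
    for i in 0..n {
        let root = uf.find(i);
        groups.entry(root).or_default().push(i);
    }
    groups.into_values().collect()
}
```

With the toy vectors `[1.0, 0.0]`, `[0.866, 0.5]`, `[0.5, 0.866]`, and `[-1.0, 0.0]` and a threshold of 0.8, entities 0 and 2 are not directly similar (cosine 0.5), but both are similar to entity 1, so all three land in one cluster while entity 3 stays alone. That is the transitive relationship discovery the agenda refers to. The quadratic pairwise loop is only workable for small inputs; a real system would ask an index for candidate pairs instead.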
DoppelGoner (Rust/TypeScript) deduplicates HSDS (Human Services Data Specification) entities drawn from multiple HSDS data sources.
This Rust/TypeScript project demonstrates semantic search using vectorized data.
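As a rough illustration of semantic search over vectorized data, the sketch below ranks a small in-memory corpus by cosine similarity to a query vector and returns the top `k` matches. It is an assumption-laden toy, not the project's implementation: the function names `cosine` and `top_k` are hypothetical, the vectors in the usage note are made up rather than produced by an embedding model, and a linear scan replaces whatever vector index the project actually uses.

```rust
/// Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Return the `k` corpus entries most similar to `query`, best first.
/// `corpus` pairs a label with its (precomputed) embedding.
fn top_k<'a>(
    query: &[f32],
    corpus: &[(&'a str, Vec<f32>)],
    k: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&'a str, f32)> = corpus
        .iter()
        .map(|(name, vector)| (*name, cosine(query, vector)))
        .collect();
    // Sort by descending similarity; total_cmp gives a total order on f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

For example, with a made-up corpus of "food pantry" `[0.9, 0.1, 0.0]`, "legal aid" `[0.0, 0.1, 0.9]`, and "food bank" `[0.8, 0.2, 0.1]`, the query `[1.0, 0.0, 0.0]` with `k = 2` ranks "food pantry" first and "food bank" second. The two labels share only one word, yet sit close together in this toy vector space.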