AI in Production: Evals & Observability Workshop
Best Practices in Evals & Observability Workshop

Building LLM-based applications is exciting, but keeping them reliably performant in production is a challenge, even for the best teams. How do you track user interactions? Diagnose issues? Prevent regressions when prompts or models change? At Weights & Biases, we've worked with hundreds of companies to tackle questions like these.
Join us for this exclusive workshop, where we'll guide you through building and evaluating an AI application using a proven, future-proof workflow, learning how to:
- Log and analyze system behavior.
- Collect and leverage user feedback.
- Design evaluations to identify and resolve issues.
- Confidently fine-tune and optimize performance.
No prior experience with evaluations? No problem. Just bring your laptop, and leave with practical skills to future-proof your AI applications.
Event Details
- When: January 13, 2025 (Monday)
- Time: 5:00 PM - 9:00 PM
- Where: Downtown Seattle (Address available to confirmed attendees)
Agenda
- 5:00 PM - 5:30 PM: Arrival and Food
- 5:30 PM - 8:00 PM: Workshop: Best Practices in Evals & Observability
  - Understanding evaluation metrics for AI models
  - Hands-on Lab: Building evaluation pipelines
  - Interactive Session: Implementing observability tools
  - Real-world case studies and troubleshooting
- 8:00 PM - 8:30 PM: Q&A + Open Discussion
- 8:30 PM - 9:00 PM: Networking
Bring your laptop: you'll be building real evaluation pipelines and implementing observability tools throughout this hands-on workshop. Setup instructions and prep materials will be provided in advance.
Special Guests
Alex Volkov
AI Evangelist @ Weights & Biases
Alex Volkov is a leading AI practitioner and evangelist at Weights & Biases, as well as the host of the popular Thursd/AI webcast, which attracts thousands of live listeners each week. Known for his ability to stay ahead of industry trends, Alex combines hands-on experience with a deep understanding of the complex landscape of emerging AI tools, evaluation methodologies, and observability best practices. His work helping practitioners move beyond prototypes into reliable, production-scale AI systems makes him an invaluable voice in the field.
Who Should Attend?
This session is curated for AI developers, data scientists, and engineers who:
- Are actively building or managing AI pipelines
- Want to level up their evaluation practices, whether they currently go by vibes or use sophisticated metrics
- Are looking to improve how they track and analyze the results they get from large models in production
- Are interested in learning about emerging tools and techniques in this rapidly evolving field
The reality is that many developers still evaluate models based on vibes and track results in spreadsheets. We welcome you to this session for a boost and a level-up! Whether you're just starting to formalize your evaluation process or already have robust systems in place, this session will help you take your next step forward.
Bring Your Laptop: Code along with live examples and leave with practical tools.
Whether you're looking to improve your model evaluation processes or enhance your observability practices, this workshop is for you. Gain actionable insights, connect with like-minded peers, and explore the future of AI development.
Sponsor
This event is sponsored by Weights & Biases Weave. When you're ready to take your tinkering to production, you'll need reliable logging and tracing. Weave will get you there with 3 lines of code, and you'll get a beautiful dashboard you can share with even the non-technical folks! Plus, once you're ready to upgrade to o1 or gpt-next, Weave offers a robust LLM evaluation pipeline so you'll always know which LLM works best for your use case.

Space is very limited for this hands-on session. Please register now!


AI in Production Event Series
The AI in Production series by AI Tinkerers explores the realities of deploying AI systems at scale. By inviting experts and experienced practitioners to share their insights, the series emphasizes best practices, advanced tooling, and real-world applications, helping teams close the gap between experimentation and business-ready systems that create value.
About AI Tinkerers
AI Tinkerers is an exclusive global network of top AI talent hosting in-person events in over 60 cities worldwide. We bring together active builders, practitioners, and experts who are at the forefront of AI development, particularly in areas like large language models and generative AI. Our community is dedicated to fostering technical innovation and collaboration among highly skilled professionals actively shaping the future of artificial intelligence.
- Attendees: This exclusive community of 5,394 technical professionals features a high-signal membership where 35% are CTOs, founders, or principal architects. Key skill areas are balanced across generative AI orchestration (35%), distributed systems infrastructure (35%), and applied machine learning research (30%). Notable achievements include members deploying production-grade agentic architectures, pioneering open-source AI developer tooling, and optimizing GPU-compiler performance at scale.
- Companies Represented: Tech giants like Amazon, Microsoft, Meta, Apple, and Google, alongside innovative AI platforms and startups such as OpenAI, Anthropic, OpenPipe, Fixie.ai, CopilotKit, and more.
