Llama 2, Mistral Fine-Tuning Benchmarks
This talk compares fine-tuning performance of Llama 2, Mistral, OpenHermes 2.5, Zephyr, and GPT-3.5 through detailed benchmark evaluations.
We’ve run a bunch of benchmarks on fine-tuned open-source models and compared them head to head. These include the strongest base models (Llama 2 and Mistral) as well as several open-source instruct-tuned variants that have made a splash recently, such as OpenHermes 2.5 and Zephyr. We have also evaluated them against a fine-tuned OpenAI GPT-3.5 itself, and gotten some surprising results!
OpenPipe turns expensive LLM prompts into fine-tuned models for cheaper, OpenAI-compatible inference.