There is a deep connection between the open-source weightwatcher tool, which implements ideas from the theory of Heavy Tailed Self-Regularization…
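For readers landing here first, below is a minimal sketch of the basic weightwatcher workflow that these posts build on; the ResNet model is purely illustrative, not taken from any post.

```python
# Minimal weightwatcher workflow: fit the eigenvalue spectra of the layer
# weight matrices and report the HTSR power-law metric alpha per layer.
import weightwatcher as ww
import torchvision.models as models

model = models.resnet18(weights="IMAGENET1K_V1")  # illustrative model only

watcher = ww.WeightWatcher(model=model)
details = watcher.analyze()              # pandas DataFrame, one row per layer
summary = watcher.get_summary(details)   # average alpha and related metrics
print(summary)
```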
Fine-Tuned Llama3.2: Bad Instructions?
Recently, Meta released the Llama3.2 1B and 3B Instruct fine-tuned LLMs, to mixed reviews. On the one hand, it’s ranking…
What’s instructive about Instruct Fine-Tuning: a weightwatcher analysis
Are you fine-tuning an open-source LLM, like Llama, Mistral, or Qwen? That is, Instruct Fine-Tuning. Whether you…
Describing Double Descent with WeightWatcher
Double Descent (DD) is something that has surprised statisticians, computer scientists, and deep learning practitioners, but it was known in the…
SVDSmoothing LLM Layers with WeightWatcher
Recently, Microsoft Research published the LASER method, “Layer-Selective Rank Reduction,” in the very popular paper The Truth is in There:…
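As a rough sketch of the idea, weightwatcher exposes an SVDSmoothing method for rank-reducing layer weight matrices; the exact call signature below is an assumption, so check the project docs before relying on it.

```python
# Sketch, under assumptions: SVD-truncate (smooth) the layer weight
# matrices, in the spirit of LASER's layer-selective rank reduction.
import weightwatcher as ww
import torch.nn as nn

# toy model so the sketch is self-contained
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10))

watcher = ww.WeightWatcher(model=model)
smoothed_model = watcher.SVDSmoothing(model=model)  # signature assumed; see docs
```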
Evaluating LLMs with WeightWatcher Part III: The Magic of Mistral, a Story of Dragon Kings
Recently, the Mistral models have taken the LLM world by storm. The Mistral Mixture of Experts (MoE) 8x7B model outperforms other…
Evaluating Fine-Tuned LLMs with WeightWatcher Part II: PEFT / LoRA Models
Evaluating LLMs is hard, especially when you don’t have a lot of test data. In the last post, we saw how to…
Evaluating Fine-Tuned LLMs with WeightWatcher
If you are fine-tuning your own LLMs, you need a way to evaluate them. And while there are over a dozen…
WeightWatcher new feature: fix_fingers='clip_xmax'
WeightWatcher 0.7 has just been released, and it includes the new and improved advanced feature for analyzing Deep Neural Networks…
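The feature named in the title is an option to analyze(); here is a minimal sketch, with a toy model standing in for a real network.

```python
# fix_fingers='clip_xmax' clips the largest eigenvalues ("fingers") before
# fitting the power-law tail of each layer's eigenvalue spectrum.
import weightwatcher as ww
import torch.nn as nn

# toy model so the sketch is self-contained
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))

watcher = ww.WeightWatcher(model=model)
details = watcher.analyze(fix_fingers='clip_xmax')
print(details[['layer_id', 'alpha']])   # per-layer power-law exponents
```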
WeightWatcher 0.7: March 2023
First, let me say thanks to all the users in our great community — we have reached over 93K downloads…