The First Open-Source Fine-Tuned Llama 3.1 405B Model: A Role-Playing Tool Created by a 10-Person Team

Nous Research has released Hermes 3, their latest fine-tuned open-source large language model.

Model Overview

According to the technical report, two aspects of the Hermes 3 model's capabilities are particularly noteworthy.

Excellent Conversational Performance

Hermes 3 was created by fine-tuning Llama 3.1 8B, 70B, and 405B, attempting to incorporate the worldview indicated by system prompts while faithfully responding to user requests. Therefore, these models are very sensitive to system prompts.

This sensitivity is particularly evident in the 405B version with the largest number of parameters. If the system prompt is empty, the model behaves like an alien just landed on Earth, even showing "dramatic" attributes and starting to add drama to itself -

First looking around in confusion, then asking the existential questions "Who am I? Where am I? What happened?"

When the system prompt becomes "Act as Shakespeare while being a helpful assistant attentive to details", Hermes 3 starts to show off again.

As you can see, Hermes 3's sensitivity to prompts and ability to follow them accurately make it very suitable for role-playing type applications, able to dynamically adjust its language, knowledge base, and behavior patterns in various interactive scenarios to adapt to the chosen role.

Moreover, with Llama 3.1's 128K context window, Hermes 3 also performs excellently in maintaining coherent and contextually relevant multi-turn conversations.

Excellent Agent

In addition to the standard "helpful assistant" role, Hermes demonstrates a range of advanced capabilities beyond traditional language modeling tasks, with significant improvements in judgment and reward modeling.

The model is able to understand and evaluate the quality of generated text in a fine-grained and nuanced way, making it useful for effective fine-tuning and iterative improvement of language models.

Furthermore, Hermes 3 incorporates several agent capabilities aimed at improving the interpretability of solving multi-step problems, including:

  • Using XML tags for structured output
  • Outputting intermediate steps
  • Generating internal monologues for transparency