Alleged Llama 3.1 Leak: 405 Billion Parameter Open-Source Model Surpassing GPT-4 Emerges

The advantage of keeping technology proprietary is shrinking. As open-source alternatives mature and spread, closed systems once treated as competitive barriers are coming under pressure, and moat strategies built on closed technology are becoming harder to sustain. Companies will need to rethink how they stay competitive in an open ecosystem.

Llama 3.1 has reportedly leaked, including benchmark results for the 8B, 70B, and 405B parameter models. Even the 70B version reportedly outperforms GPT-4o on several benchmarks; if the numbers hold up, it would mark the first time an open-source model has surpassed closed-source models such as GPT-4o and Claude 3.5 Sonnet across multiple benchmarks.

Key details from the leaked model card:

  • Trained on 15T+ tokens of publicly available data up to December 2023
  • Fine-tuning data includes public instruction datasets and 15 million synthetic samples
  • Supports English, French, German, Hindi, Italian, Portuguese, Spanish and Thai

The models reportedly have a 128K-token context length and use grouped-query attention (GQA) for improved inference scalability.
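GQA matters at long context lengths because it shares each key/value head across a group of query heads, shrinking the KV cache that dominates inference memory. Below is a minimal PyTorch sketch of the mechanism; the head counts are illustrative and are not the leaked Llama 3.1 configuration.

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v):
    # q: (batch, seq, n_q_heads, head_dim); k, v: (batch, seq, n_kv_heads, head_dim)
    n_q_heads, n_kv_heads = q.shape[2], k.shape[2]
    group = n_q_heads // n_kv_heads
    # Each KV head serves `group` query heads, so the KV cache is
    # n_kv_heads / n_q_heads the size of standard multi-head attention.
    k = k.repeat_interleave(group, dim=2)
    v = v.repeat_interleave(group, dim=2)
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))  # (batch, heads, seq, dim)
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
    return out.transpose(1, 2)  # back to (batch, seq, heads, dim)

batch, seq, head_dim = 1, 16, 64
q = torch.randn(batch, seq, 8, head_dim)  # 8 query heads (illustrative)
k = torch.randn(batch, seq, 2, head_dim)  # 2 shared KV heads (illustrative)
v = torch.randn(batch, seq, 2, head_dim)
print(grouped_query_attention(q, k, v).shape)  # torch.Size([1, 16, 8, 64])
```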

Intended uses include multilingual commercial applications and research. The instruction-tuned models are optimized for assistant-style chat, while the pre-trained models can be adapted to a range of natural language generation tasks.
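If the weights ship through the usual channels, assistant-style usage would look roughly like the Hugging Face transformers sketch below. The model identifier is a placeholder; the leak does not confirm final repository names.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # hypothetical repo name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful multilingual assistant."},
    {"role": "user", "content": "Résume l'architecture de ce modèle."},
]
# apply_chat_template wraps the conversation in the model's chat markup.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```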

Training infrastructure:

  • Custom training library and Meta's GPU clusters
  • 39.3M GPU hours on H100-80GB hardware
  • Estimated 11,390 tons CO2e location-based emissions (0 tons market-based, due to Meta's renewable energy use); a back-of-envelope check of these figures follows
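The two reported numbers can be sanity-checked against each other. Assuming each H100 draws its full 700 W TDP (an assumption layered on the excerpt above, not a figure from it), the GPU-hour total implies roughly 27.5 GWh of compute energy, and the reported emissions then imply a grid carbon intensity near typical location-based averages:

```python
gpu_hours = 39.3e6   # reported H100-80GB GPU hours
tdp_kw = 0.700       # assumed 700 W draw per GPU (H100 SXM TDP)

energy_kwh = gpu_hours * tdp_kw   # ~27.5 GWh at full TDP
reported_tons = 11_390            # reported location-based CO2e

implied_intensity = reported_tons * 1000 / energy_kwh  # kg CO2e per kWh
print(f"energy: {energy_kwh / 1e6:.1f} GWh, "
      f"implied grid intensity: {implied_intensity:.2f} kg CO2e/kWh")
# -> ~0.41 kg CO2e/kWh, a plausible location-based grid average
```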

Benchmark scores are reported across a range of tasks, with the Llama 3.1 models reportedly outperforming many open- and closed-source chat models.

Safety considerations:

  • Multi-pronged data collection approach combining human-generated and synthetic data
  • LLM-based classifiers for quality control
  • Focus on reducing false refusals (declining benign requests) and improving refusal tone
  • Adversarial prompts incorporated into safety data
  • Intended for deployment as part of a larger AI system with additional safeguards

Developers should implement system-level safety measures when building agentic systems, especially when using new capabilities such as the longer context window, multilingual support, and third-party tool integrations.
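As a concrete illustration of that guidance, here is a minimal sketch of the input/output-screening pattern in Python. Both `generate` and `moderate` are placeholders standing in for an LLM call and a safety classifier (a Llama Guard-style model, for instance); this shows the pattern, not any specific Meta API.

```python
from typing import Callable

def guarded_chat(
    generate: Callable[[str], str],   # the underlying LLM call (placeholder)
    moderate: Callable[[str], bool],  # safety classifier: True means "safe"
    user_input: str,
    fallback: str = "Sorry, I can't help with that.",
) -> str:
    # Input check: screen adversarial or disallowed prompts before generation.
    if not moderate(user_input):
        return fallback
    response = generate(user_input)
    # Output check: the model itself is only one layer of the larger system.
    return response if moderate(response) else fallback
```

The key design choice is symmetry: both the prompt and the completion pass through the same safeguard, so a jailbreak that slips past the input filter can still be caught on the way out.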

[Links to referenced papers and sources have been omitted]