An anonymous model codenamed sus-column-r has recently performed strongly in the LMSYS Chatbot Arena, sparking much speculation. Yesterday, Musk finally resolved the mystery: it is xAI's upcoming new model, Grok2.
In the officially released battle data, Grok2 achieved high win rates against other mainstream models such as GPT-4o and Claude 3.5 Sonnet; the one exception was Google's Gemini 1.5 Pro. Across various benchmark tests, Grok2's capabilities are also comparable to those of top AI models.
A major upgrade in Grok2 is the addition of image generation, achieved through a collaboration with Black Forest Labs and its FLUX.1 model. Tests have found that Grok2 is bolder in image generation than its competitors and will produce some controversial content, such as images spoofing public figures. This could expose it to legal risk.
In practical use, Grok2 performs well on some basic questions, such as decimal comparisons and counting, which are common AI pitfalls. Its answers are usually quite detailed. However, on some questions requiring deeper understanding, GPT-4o still has an advantage.
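For readers who want to check such answers themselves, the ground truth for both kinds of question is easy to compute. The sketch below is only an illustration: the article does not list the actual prompts, so the two examples used here (comparing 9.11 with 9.9, and counting the letter "r" in "strawberry") are assumed, commonly cited cases, and the helper names `larger_decimal` and `count_letter` are hypothetical.

```python
from decimal import Decimal


def larger_decimal(a: str, b: str) -> str:
    """Return whichever of two decimal strings is numerically larger.

    Decimal compares by value, so "9.9" beats "9.11" even though
    "11" looks bigger than "9" when the digits are read as integers.
    """
    return a if Decimal(a) > Decimal(b) else b


def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


# Assumed example prompts, not taken from the article.
print(larger_decimal("9.11", "9.9"))    # 9.9
print(count_letter("strawberry", "r"))  # 3
```

Models tend to stumble on these because they process text as tokens rather than as numeric values or individual characters, which is why a detailed, correct answer here is treated as a meaningful signal.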
Overall, Grok2 has indeed demonstrated significant capability improvements, especially in areas like mathematics. However, it still has gaps compared to other top AI models and needs further improvement. This release shows xAI's ambition and progress in the AI field.