*The project codenamed "Strawberry" has recently received widespread attention. An account called "Strawberry Bro" has kept posting teasers about it, generating both anticipation and disappointment.
Recently, the founder of AI agent startup MultiOn announced a new intelligent agent called Agent Q, said that it controls the "Strawberry Bro" account, and invited users to try it online. The marketing move confused many people who had been waiting for major news from OpenAI.
MultiOn claims Agent Q is a breakthrough AI agent that combines techniques such as Monte Carlo Tree Search (MCTS) and self-critique. It is said to achieve 3.4 times the zero-shot performance of the Llama 3 baseline, with a 95.4% success rate in real-world task evaluations.
Agent Q can perform tasks such as booking restaurant tables and flights. However, netizens are not convinced and are more concerned about whether MultiOn is using the "Strawberry Bro" account for hype.
The paper related to Agent Q has been published, with main components including:
- Guided search using MCTS
- AI self-critique
- Direct Preference Optimization (DPO)
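
For readers unfamiliar with DPO, the standard loss for a single preference pair can be sketched as below. This is the generic DPO objective, not code from the Agent Q paper; the function name, the argument names, and the `beta` default are illustrative assumptions.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Generic DPO loss for one (chosen, rejected) pair of responses.

    Inputs are log-probabilities of each response under the policy being
    trained (logp_*) and under a frozen reference model (ref_logp_*).
    """
    # How much more the policy prefers the chosen response than the reference does.
    margin = beta * ((logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected))
    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy matches the reference, the margin is zero and the loss equals log 2; the loss falls as the policy shifts probability toward the preferred response.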
Researchers explored how to give agents additional search capabilities through MCTS, framing a web agent's task execution as a search over a tree of web pages.
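
To illustrate the idea of treating web navigation as tree search, here is a minimal MCTS sketch over a toy site map. It is not Agent Q's implementation: the `SITE` map, the page names, the random rollout policy, and the UCB1 constant are all hypothetical, and a real agent would use a language model to propose and score actions.

```python
import math
import random

# Hypothetical site map: each page lists the links an agent could click.
SITE = {
    "home": ["restaurants", "blog"],
    "restaurants": ["reserve", "reviews"],
    "reserve": ["confirmation"],
    "blog": [],
    "reviews": [],
    "confirmation": [],
}

class Node:
    def __init__(self, page, parent=None):
        self.page = page
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0  # sum of rewards backed up through this node

def ucb1(child, parent_visits, c=1.4):
    # Unvisited children are tried first; otherwise balance reward and exploration.
    if child.visits == 0:
        return float("inf")
    exploit = child.value / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore

def search(start="home", goal="confirmation", iterations=200, seed=0):
    """Return the first click from `start` that MCTS visited most often."""
    rng = random.Random(seed)
    root = Node(start)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through expanded nodes by UCB1 score.
        while node.children:
            parent = node
            node = max(parent.children, key=lambda ch: ucb1(ch, parent.visits))
        # 2. Expansion: add one child per link on a non-terminal page.
        if SITE[node.page]:
            node.children = [Node(link, parent=node) for link in SITE[node.page]]
            node = rng.choice(node.children)
        # 3. Rollout: click random links until a page with no links is reached.
        page = node.page
        while SITE[page]:
            page = rng.choice(SITE[page])
        reward = 1.0 if page == goal else 0.0
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).page
```

Calling `search()` returns the link from the start page that the search explored most, which in this toy map is the one leading toward the booking confirmation page.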
Experimental results show that after applying MCTS, the success rate of the base model increased from 28.6% to 48.4%. After further fine-tuning, Agent Q's performance reached 50.5%, slightly exceeding average human performance.
Although the technical details are intriguing, MultiOn's marketing approach has sparked controversy, with some netizens calling them "shameless frauds."*