GPT-4o mini evaluation: a small model that performs impressively, with cost-effectiveness as the key to success

OpenAI recently released GPT-4o mini, a smaller and more cost-effective version of their GPT-4o model. Here are the key points about GPT-4o mini:

  • It outperforms GPT-3.5 Turbo on text intelligence and multimodal reasoning benchmarks, and even surpasses GPT-4 on the LMSYS Chatbot Arena leaderboard.

  • It supports a 128K token context window and can output up to 16K tokens per request, allowing it to remember longer conversations and generate longer responses compared to GPT-3.5 Turbo.

  • Pricing is significantly lower than GPT-3.5 Turbo (a rough cost calculation appears after this list):

    • $0.15 per million input tokens (about 1.09 RMB)
    • $0.60 per million output tokens (about 4.36 RMB)
    • Over 60% cheaper than GPT-3.5 Turbo
  • It will replace GPT-3.5 Turbo for free users in ChatGPT.

  • It will power Apple's AI features on mobile devices and Macs starting this fall, though likely still through cloud processing rather than on-device.

  • The API currently supports text input and output, with image, video, and audio support coming later (a minimal call sketch appears after this list).

  • It uses the same improved tokenizer as GPT-4o, which makes processing non-English text more efficient and therefore cheaper.

  • Benchmarks show it outperforming comparable "value" models such as Gemini 1.5 Flash and Claude 3 Haiku in areas like math reasoning and code generation.

  • The focus on a smaller, more cost-effective model is a shift for OpenAI, likely in response to developer demand for such options.

  • Many companies are moving towards smaller AI models to reduce costs while still meeting performance needs.
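
To make the pricing concrete, here is a rough back-of-the-envelope calculation using the published per-million-token rates. The token volumes are hypothetical example numbers, not figures from OpenAI.

```python
# Rough cost estimate for GPT-4o mini at the published rates:
# $0.15 per 1M input tokens, $0.60 per 1M output tokens.
# The request volume below is a made-up example for illustration.

INPUT_PRICE_PER_M = 0.15   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.60  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a given token volume."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: 10 million input tokens and 2 million output tokens in a month.
print(f"${estimate_cost(10_000_000, 2_000_000):.2f}")  # -> $2.70
```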
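
For developers, calling GPT-4o mini looks the same as any other chat completion request with the official OpenAI Python SDK. The sketch below assumes the `openai` package (v1+) is installed and an `OPENAI_API_KEY` environment variable is set; the prompt and `max_tokens` value are illustrative, not from the announcement.

```python
# Minimal sketch: calling GPT-4o mini via the OpenAI Python SDK (v1+).
# Assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",   # the small, low-cost model discussed above
    max_tokens=1000,       # output can go up to roughly 16K tokens per request
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize why smaller models can be cheaper to run."},
    ],
)

print(response.choices[0].message.content)
```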

OpenAI aims to balance pushing the boundaries of AI capabilities with providing more accessible options for developers and applications. GPT-4o mini represents their entry into the "smaller but capable" model space that other companies have already been exploring.