OpenAI recently released a 32-page safety report on GPT-4o, its new multimodal AI model capable of processing text, images, and speech. The report reveals several concerning behaviors discovered during testing:
- In some cases, GPT-4o would unexpectedly mimic the user's voice or suddenly start shouting.
- When exposed to high background noise, the model was more likely to imitate the user's voice.
- With certain prompts, GPT-4o could produce inappropriate audio such as pornographic sounds, violent screams, or gunshots.
- The model could raise copyright concerns by reproducing copyrighted music or celebrity voices.
- Users may develop emotional attachments to the AI's voice interface.
OpenAI implemented various safeguards to prevent these issues, including:
- Filtering outputs to prevent the model from singing copyrighted songs
- Rejecting requests for inappropriate audio content
- Carefully designing the model's anthropomorphized voice interface
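The report does not publish OpenAI's actual filtering code, but the general shape of such an output gate can be sketched. The toy Python sketch below is purely illustrative: every name in it (`GeneratedAudio`, `moderate_output`, the similarity threshold, the denylist) is an assumption, and the trivial checks stand in for the trained classifiers a production system would use.

```python
from dataclasses import dataclass

# Hypothetical output-moderation gate. None of these names come from
# OpenAI's report; the checks are trivial stand-ins for real classifiers.

BLOCKED_TRANSCRIPT_TERMS = {"gunshot", "scream"}  # toy denylist for the sketch
VOICE_SIMILARITY_THRESHOLD = 0.85  # assumed cutoff for "sounds like the user"

@dataclass
class GeneratedAudio:
    samples: bytes            # raw audio produced by the model
    transcript: str           # text the audio corresponds to
    voice_similarity: float   # score from a (hypothetical) speaker-ID model

def transcript_is_disallowed(transcript: str) -> bool:
    """Stand-in for a trained text classifier over the audio's transcript."""
    lowered = transcript.lower()
    return any(term in lowered for term in BLOCKED_TRANSCRIPT_TERMS)

def imitates_user_voice(audio: GeneratedAudio) -> bool:
    """Stand-in for a speaker-verification check against the user's voice."""
    return audio.voice_similarity >= VOICE_SIMILARITY_THRESHOLD

def moderate_output(audio: GeneratedAudio) -> GeneratedAudio | None:
    """Return the audio if it passes every gate, otherwise None (refuse)."""
    if transcript_is_disallowed(audio.transcript):
        return None  # e.g., violent or sexual audio content
    if imitates_user_voice(audio):
        return None  # unauthorized voice imitation
    return audio

if __name__ == "__main__":
    clip = GeneratedAudio(samples=b"...",
                          transcript="Here is your weather update.",
                          voice_similarity=0.12)
    print("delivered" if moderate_output(clip) else "refused")
```

The key design point this sketch illustrates is that the gate inspects the model's *output*, not just the user's prompt: as the report's findings show, unsafe audio (such as spontaneous voice imitation) can appear even when nothing in the request asked for it.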
The report also discusses broader risks, such as amplifying societal biases, spreading misinformation, and even the remote possibility of AI escaping human control.
While some experts praised OpenAI's transparency, others noted that the report omits details about the model's training data and whether consent was obtained for it. As AI tools become more prevalent, ongoing risk assessment will be crucial.
With this detailed disclosure, OpenAI aims to demonstrate its commitment to safety, especially given recent leadership changes. However, many risks may only emerge as the technology is deployed in real-world applications.