OpenAI's GPT-4o Model Deemed 'Medium' Risk in New System Card Release

OpenAI has published the GPT-4o System Card, a comprehensive research document outlining the safety measures and risk assessments conducted before the public release of GPT-4o in May. This report provides insight into the evaluations and mitigations applied to ensure the model's safe deployment.

Pre-Release Risk Assessment

Before GPT-4o's launch, OpenAI enlisted external red teamers—security experts tasked with identifying potential vulnerabilities—to assess risks such as unauthorized voice cloning, inappropriate content generation, and reproduction of copyrighted materials. These assessments were crucial in determining the model's overall risk level, and their findings are now publicly available.

Key Findings and Risk Classification

The GPT-4o System Card reveals that the model was classified as "medium" risk overall. This classification reflects the highest risk rating among four categories: cybersecurity, biological threats, persuasion, and model autonomy. While most categories were deemed low risk, the persuasion category was flagged as borderline medium because a few of GPT-4o's writing samples were potentially more effective at swaying opinions than human-written content.
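
The "highest category wins" rule is simple enough to express directly. The following sketch is a hypothetical illustration, not OpenAI's actual code; the names, the level ordering, and the rollup function are all assumptions. It shows how an overall "medium" rating falls out of four category ratings:

```python
# Hypothetical sketch of a Preparedness-Framework-style rollup:
# the overall risk rating is simply the highest rating among the categories.
RISK_LEVELS = ["low", "medium", "high", "critical"]  # ordered low -> high

def overall_risk(category_ratings: dict) -> str:
    """Return the highest risk level found among the category ratings."""
    return max(category_ratings.values(), key=RISK_LEVELS.index)

# GPT-4o's reported ratings: low everywhere except persuasion.
ratings = {
    "cybersecurity": "low",
    "biological_threats": "low",
    "persuasion": "medium",   # flagged as borderline medium
    "model_autonomy": "low",
}
print(overall_risk(ratings))  # -> "medium"
```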

External Evaluations and Transparency Efforts

Lindsay McCallum Rémy, an OpenAI spokesperson, explained that the system card includes evaluations conducted by both internal teams and external testers, including groups like Model Evaluation and Threat Research (METR) and Apollo Research. These organizations specialize in evaluating AI systems and contributed to the assessment process.

Summary of the GPT-4o System Card

The GPT-4o System Card highlights the extensive safety work carried out, including:

  • Safety Evaluations: The system card details various safety evaluations conducted on GPT-4o, focusing on its text, vision, and especially its audio capabilities, which present novel risks. Evaluations included tests for speaker identification, unauthorized voice generation, and the generation of copyrighted content.

  • Preparedness Framework: The report includes Preparedness Framework evaluations, which assessed GPT-4o across four risk categories—cybersecurity, CBRN (chemical, biological, radiological, nuclear), persuasion, and model autonomy. The model was classified as low risk in all categories except persuasion.

  • Audio Capabilities: Special attention was given to the model’s audio capabilities, including the risks of unauthorized voice generation and speaker identification. Mitigations were put in place to limit the use of unauthorized voices (a sketch of how such a check might work follows this list) and to ensure that the model only complies with requests to identify well-known public figures based on voice input.

  • Voice Modality and Safety Behavior: GPT-4o’s ability to handle audio inputs was evaluated for potential risks such as bias, ungrounded inference, and generating disallowed content. These evaluations were key to ensuring that the model’s voice outputs are consistent with safety protocols.

  • Red Teaming and External Evaluations: OpenAI worked with over 100 external red teamers from 29 different countries to assess the model's performance and safety across various scenarios. These efforts included testing the model’s ability to handle different accents, voices, and audio conditions.
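
The mitigation against unauthorized voices mentioned above can be pictured as an allow-list check on the model's audio output. Below is a minimal, hypothetical sketch assuming a speaker-embedding comparison against the approved preset voices; the function names and the 0.85 similarity cutoff are illustrative assumptions, not details taken from the system card:

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.85  # assumed cutoff, not a published value

def is_authorized_voice(clip_emb, preset_embs, threshold=SIMILARITY_THRESHOLD):
    """Allow a generated clip only if its speaker embedding closely
    matches at least one approved preset voice."""
    for ref in preset_embs:
        cos = float(np.dot(clip_emb, ref) /
                    (np.linalg.norm(clip_emb) * np.linalg.norm(ref)))
        if cos >= threshold:
            return True
    return False  # deviates from every preset: block the output

# Toy usage with random stand-in embeddings.
rng = np.random.default_rng(0)
presets = [rng.normal(size=128) for _ in range(3)]
print(is_authorized_voice(presets[0] + 0.01 * rng.normal(size=128), presets))  # True
print(is_authorized_voice(rng.normal(size=128), presets))                      # likely False
```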

Context and Timing of the Release

This is not the first system card released by OpenAI; previous documents were published for GPT-4, GPT-4 with vision, and DALL-E 3. However, the GPT-4o System Card arrives at a particularly critical time, as the company faces increased scrutiny of its safety standards from employees, stakeholders, lawmakers, and the public, especially in light of the upcoming U.S. presidential election and the concerns it raises about potential misuse of AI models like GPT-4o.

Public and Legislative Concerns

The timing of the release coincides with heightened public and legislative concerns, especially regarding the model’s training data and safety testing. Just before the system card was published, an exclusive report by The Verge revealed an open letter from Sen. Elizabeth Warren (D-MA) and Rep. Lori Trahan (D-MA), demanding answers about OpenAI's handling of whistleblowers and safety reviews. This letter highlighted various safety concerns within the company, including the brief ousting of CEO Sam Altman in 2023 and the resignation of a safety executive who claimed that "safety culture and processes have taken a backseat to shiny products."

Looking Ahead: Legislation and Accountability

There have been increasing calls for greater transparency from OpenAI, especially regarding the model’s training data and safety testing procedures. In California, state Sen. Scott Wiener is pushing for legislation to regulate large language models, which would hold companies legally accountable if their AI is used in harmful ways. If this bill passes, OpenAI’s future models will need to comply with state-mandated risk assessments before being released to the public.

Conclusion

The GPT-4o System Card offers a detailed look at OpenAI’s efforts to ensure the safe deployment of its latest AI model, but it also underscores the importance of continued scrutiny and transparency. While the involvement of external testers and thorough evaluations are positive steps, much of the responsibility still falls on OpenAI to self-assess and mitigate risks. As AI technology evolves, that scrutiny will be essential to maintaining public trust and ensuring that these powerful tools are used responsibly. For more details, see the system card on OpenAI’s website.