OpenAI Updates Safety Framework, Signals Flexibility Amid AI Race

Image Source: ChatGPT-4o
OpenAI has released a major update to its Preparedness Framework, outlining how the company evaluates and manages risks associated with frontier AI systems. The updated framework sharpens how OpenAI tracks potentially dangerous capabilities, clarifies what it means to “sufficiently minimize” risk, and introduces new procedures for safety reporting and governance.
But one provision is drawing particular attention: OpenAI now says it may adjust its own safety requirements if other developers release high-risk AI systems without comparable safeguards—a move that reflects growing competitive pressure in the AI arms race.
What’s New in the Framework
The Preparedness Framework is OpenAI’s internal system for identifying, evaluating, and mitigating risks tied to emerging AI capabilities. Key updates include:
Risk Prioritization Criteria: OpenAI now focuses on five key indicators when evaluating whether a capability requires safeguards: risk plausibility, measurability, severity, how unprecedented the risk is, and its immediacy.
Tracked vs. Research Categories: OpenAI now splits AI capabilities into two tiers. Tracked Categories include well-established risk areas where the company already has strong evaluations and safeguards in place: biological and chemical security, cybersecurity, and AI self-improvement. These categories are actively monitored due to their dual-use nature—offering both high potential and high risk.
Meanwhile, Research Categories cover emerging areas that may pose severe harm but aren’t yet fully understood or measurable. These include:
Long-range autonomy (AI acting independently over time or distance)
Sandbagging (intentional underperformance by AI)
Autonomous replication and adaptation (AI systems evolving on their own)
Undermining safeguards
Nuclear and radiological risks
OpenAI says these emerging areas are being actively studied through new threat models and evaluation techniques.
Updated Capability Levels: OpenAI now uses two simplified thresholds to categorize AI capabilities:
High Capability: Systems that could amplify existing severe risks
Critical Capability: Systems that could introduce entirely new, unprecedented risks
High-capability systems must have strong safeguards before deployment, while Critical ones require safeguards even during development. This shift reflects a more proactive approach to risk mitigation—especially for advanced models with unpredictable behaviors.
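To make the distinction concrete, here is a minimal, purely illustrative sketch of how such a threshold gate might be expressed in code. The class names, phase labels, and logic are assumptions drawn from the description above, not OpenAI's actual tooling.

```python
# Illustrative sketch only -- not OpenAI's implementation. Names, phases,
# and logic are hypothetical, based on the framework as described above.
from enum import Enum


class CapabilityLevel(Enum):
    BELOW_THRESHOLD = "below_threshold"
    HIGH = "high"          # could amplify existing severe risks
    CRITICAL = "critical"  # could introduce new, unprecedented risks


def safeguards_required(level: CapabilityLevel, phase: str) -> bool:
    """Return True if safeguards must be in place for the given phase.

    High-capability systems need safeguards before deployment;
    Critical-capability systems need them even during development.
    """
    if level is CapabilityLevel.CRITICAL:
        return True  # applies to development and deployment alike
    if level is CapabilityLevel.HIGH:
        return phase == "deployment"
    return False


# Example: under this reading, a High-capability model could be developed,
# but not shipped, without safeguards in place.
assert safeguards_required(CapabilityLevel.HIGH, "development") is False
assert safeguards_required(CapabilityLevel.HIGH, "deployment") is True
assert safeguards_required(CapabilityLevel.CRITICAL, "development") is True
```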
New Safeguards Reports: In addition to publishing Capabilities Reports (formerly called the Preparedness Scorecard), OpenAI will now produce Safeguards Reports that detail how protections are designed, implemented, and evaluated. These reports guide decisions around deployment and reflect OpenAI’s commitment to a “defense-in-depth” strategy, ensuring safety measures are layered and resilient. The Safety Advisory Group (SAG) reviews these reports before making recommendations to leadership.
Scalable Evaluations: As AI model updates become more frequent—and less reliant on major new training runs—OpenAI is expanding its suite of automated evaluations. These tools allow the company to track performance and risks at scale, without sacrificing depth. Expert-led “deep dives” continue to complement automation, ensuring that nuanced or emergent risks are not overlooked.
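As a rough illustration of that scale-plus-depth idea, the sketch below shows how automated evaluations might triage which areas get an expert deep dive. The evaluation names, scores, and thresholds are all hypothetical; this is not OpenAI's internal tooling.

```python
# Illustrative sketch only -- a hypothetical automated evaluation harness.
# Evaluation names, scores, and thresholds are invented for this example.
from dataclasses import dataclass


@dataclass
class EvalResult:
    name: str
    score: float      # 0.0 (no concerning capability) to 1.0 (clear capability)
    threshold: float  # score at which human experts take a closer look


def run_automated_suite(model_id: str) -> list[EvalResult]:
    """Stand-in for a battery of scripted capability evaluations."""
    # In practice, each entry would be a real benchmark run against the model.
    return [
        EvalResult("cyber_offense_tasks", score=0.12, threshold=0.5),
        EvalResult("bio_uplift_questions", score=0.61, threshold=0.5),
        EvalResult("self_improvement_probes", score=0.08, threshold=0.5),
    ]


def flag_for_deep_dive(results: list[EvalResult]) -> list[str]:
    """Automated evals run at scale; anything over threshold goes to experts."""
    return [r.name for r in results if r.score >= r.threshold]


if __name__ == "__main__":
    results = run_automated_suite("frontier-model-candidate")
    print("Needs expert deep dive:", flag_for_deep_dive(results))
```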
The Competitive Clause: Adjusting Safety Standards
One of the most consequential additions appears in a section titled “Responding to shifts in the frontier landscape.” OpenAI writes:
“If another frontier AI developer releases a high-risk system without comparable safeguards, we may adjust our requirements. However, we would first rigorously confirm that the risk landscape has actually changed...”
In short, OpenAI says that before relaxing its requirements in response to such a shift, it would:
Confirm that the risk landscape has actually changed
Publicly disclose the adjustment
Ensure the change does not meaningfully increase the overall risk of severe harm
Maintain safeguards that remain more protective than the industry norm
The clause reflects a difficult balancing act: maintaining high safety standards while remaining agile in a rapidly advancing, highly competitive AI environment.
Ongoing Transparency and Safety Governance
OpenAI says its Safety Advisory Group (SAG), made up of internal safety leaders, will continue to review all safeguard and capability reports. This team can recommend deployment, request stronger protections, or delay rollout if necessary. Leadership retains final decision-making power but is expected to act on SAG's guidance.
The company reaffirmed its commitment to transparency, noting it will continue publishing Preparedness findings for each major model, including GPT‑4.5, GPT‑4o, and beyond.
What This Means
OpenAI’s updated framework shows a company still striving to set the pace for AI safety—tightening risk thresholds, formalizing governance, and expanding its evaluation toolkit. But the inclusion of a competitive response clause marks a subtle but important shift: safety, while still paramount, must now flex in a world where not every player may follow suit.
For researchers and regulators, this clause could signal a turning point in how AI labs calibrate caution and ambition.
But if safety standards become contingent on what others choose to do, we risk building ethics by consensus—rather than conviction. In a field moving this fast, leadership isn’t just about pace—it’s about principle.
The frontier is no longer defined by what AI can do—it’s now shaped by how fast, and how carefully, it’s done.
Editor’s Note: This article was created by Alicia Shapiro, CMO of AiNews.com, with writing, image, and idea-generation support from ChatGPT, an AI assistant. However, the final perspective and editorial choices are solely Alicia Shapiro’s. Special thanks to ChatGPT for assistance with research and editorial support in crafting this article.