OpenAI Releases Open-Weight Safety Models for Custom Online Moderation
OpenAI has launched two new open-weight reasoning models, gpt-oss-safeguard-120b and gpt-oss-safeguard-20b, designed to let developers apply custom safety policies dynamically at inference time rather than embedding fixed policies during training. The models, fine-tuned from the gpt-oss base and licensed under Apache 2.0, let platforms rapidly adapt safety measures to emerging risks such as fraud, self-harm, or game-specific abuse: the model interprets developer-provided policy rules and returns an auditable chain of reasoning alongside its verdict. The 120B-parameter model fits on a single 80GB GPU, while the 20B model runs on smaller GPUs with 16GB of VRAM, making them accessible across varied deployment environments.

OpenAI says the models outperform GPT-5 on internal multi-policy benchmarks and form part of a broader "defense-in-depth" safety strategy it already uses internally, where up to 16% of compute resources can be dedicated to safety reasoning. The release was made in collaboration with partners including ROOST, Discord, and SafetyKit, with ROOST also launching a community hub to support open-source safety infrastructure for smaller platforms. The move offers developers transparency and control, and supports rapid safety policy updates without retraining large models.
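The key pattern here is that the policy travels with each request as ordinary text instead of being baked into the weights. A minimal sketch of that workflow follows, assuming the 20B model is served locally behind an OpenAI-compatible API (for example via vLLM); the endpoint URL, model name string, and policy wording are illustrative assumptions, not details from OpenAI's announcement.

```python
# Minimal sketch: classify content against a developer-supplied policy at
# inference time. Assumes gpt-oss-safeguard-20b is running behind an
# OpenAI-compatible endpoint (e.g. started with vLLM); the URL, model name,
# and policy text below are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

# The policy is plain text sent with each request, so it can be revised at
# any time without retraining or redeploying the model.
POLICY = """\
Policy: in-game trading fraud
- VIOLATION: offers to sell accounts or items for real money
- VIOLATION: requests for another player's login credentials
- ALLOWED: trades conducted entirely in in-game currency
Return a verdict (VIOLATION or ALLOWED) followed by your reasoning.
"""

def moderate(content: str) -> str:
    """Ask the safeguard model to judge `content` against POLICY."""
    response = client.chat.completions.create(
        model="openai/gpt-oss-safeguard-20b",
        messages=[
            {"role": "system", "content": POLICY},
            {"role": "user", "content": content},
        ],
    )
    # The reply contains the verdict plus an auditable chain of reasoning.
    return response.choices[0].message.content

if __name__ == "__main__":
    print(moderate("Selling my maxed account, $50 via PayPal, DM me."))
```

Updating moderation behavior in this setup means editing the policy string, not fine-tuning the model, which is what enables the rapid policy iteration the release emphasizes.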
