Its first “open-weight” models since GPT-2
On August 5, 2025, OpenAI announced the free release of the GPT-OSS-120B (117 billion parameters, MoE) and GPT-OSS-20B (21 billion) models under the Apache 2.0 license, breaking with five years of proprietary releases and reaffirming a commitment to advanced open source. The complete weights can be downloaded from Hugging Face or cloned from GitHub, and the two models have already been picked up by the major technology media.
Technical architecture: innovation through sparsity
The GPT-OSS models are based on an optimized Transformer architecture incorporating Mixture-of-Experts (MoE) technology, which selectively activates parameters to maximize computational efficiency. This “super sparse” approach is a major advance in resource optimization.
GPT-OSS-120B totals 117 billion parameters but activates only 5.1 billion per token thanks to its 128 experts spread over 36 layers, of which only 4 are active simultaneously. This architecture lets the model run on a single 80 GB GPU with MXFP4 quantization, a remarkable technical feat for a model of this scale.
GPT-OSS-20B adopts a similar approach with 21 billion total parameters and 3.6 billion active per token, spread over 24 layers with 32 experts. Its memory footprint of only 16 GB makes it accessible on consumer hardware, democratizing access to advanced reasoning capabilities.
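To make the idea of sparse activation concrete, here is a minimal sketch of top-k expert routing in NumPy. The dimensions and the router are illustrative assumptions for demonstration only; they do not reproduce the actual GPT-OSS implementation, only the principle that a token touches just a few experts per layer.

```python
import numpy as np

def moe_forward(x, router_w, experts_w, top_k=4):
    """Route a single token through a sparse Mixture-of-Experts layer.

    x         : (d_model,) token representation
    router_w  : (num_experts, d_model) router weights
    experts_w : (num_experts, d_model, d_model) one weight matrix per expert
    top_k     : experts activated per token (4 of 128 in GPT-OSS-120B)
    """
    scores = router_w @ x                    # one routing logit per expert
    chosen = np.argsort(scores)[-top_k:]     # indices of the top-k experts
    gates = np.exp(scores[chosen])
    gates /= gates.sum()                     # softmax over the chosen experts only
    # Only the chosen experts' parameters participate in this token's computation.
    return sum(g * (experts_w[i] @ x) for g, i in zip(gates, chosen))

# Toy dimensions: 128 experts exist, but only 4 do any work for this token.
rng = np.random.default_rng(0)
d_model, num_experts = 64, 128
y = moe_forward(rng.normal(size=d_model),
                rng.normal(size=(num_experts, d_model)),
                rng.normal(size=(num_experts, d_model, d_model)))
print(y.shape)  # (64,)
```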
The two models integrate sophisticated architectural innovations: alternating dense and locally banded sparse attention (similar to GPT-3), grouped multi-query attention with a group size of 8 for memory efficiency, and Rotary Positional Embedding (RoPE) natively supporting contexts up to 128k tokens.
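As a rough illustration of why grouped attention saves memory, the sketch below shares one key/value head among every group of 8 query heads, so the KV cache shrinks by the same factor. Head counts and dimensions are toy values, not the published GPT-OSS configuration.

```python
import numpy as np

def grouped_query_attention(q, k, v, group=8):
    """Grouped-query attention: each group of `group` query heads shares one K/V head.

    q    : (n_q_heads, seq, d_head)
    k, v : (n_q_heads // group, seq, d_head), i.e. a KV cache `group` times smaller
    """
    n_q_heads, seq, d_head = q.shape
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group                               # shared K/V head for this query head
        scores = q[h] @ k[kv].T / np.sqrt(d_head)
        scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
        probs = np.exp(scores)
        probs /= probs.sum(axis=-1, keepdims=True)
        out[h] = probs @ v[kv]
    return out

# Toy shapes: 64 query heads but only 8 K/V heads to keep in the cache.
rng = np.random.default_rng(0)
q = rng.normal(size=(64, 16, 32))
kv = rng.normal(size=(8, 16, 32))
print(grouped_query_attention(q, kv, kv).shape)  # (64, 16, 32)
```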
Training and alignment: o4-mini techniques applied to open source
The post-training process represents a major innovation, applying the techniques of proprietary reasoning models to open models for the first time. OpenAI used a two-phase process: supervised fine-tuning followed by a compute-intensive reinforcement learning phase, aligning the models with the OpenAI Model Spec.
A distinctive feature is the implementation of three reasoning-effort levels (low, medium, high), dynamically adjustable via the system prompt. This flexibility optimizes the latency/performance trade-off for the specific needs of each application, a capability previously reserved for proprietary models.
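A minimal sketch of how this could look in practice, assuming a locally served model behind an OpenAI-compatible endpoint (for example vLLM or Ollama); the URL, model name and exact system-prompt phrasing are illustrative assumptions, not official values.

```python
from openai import OpenAI

# Assumed local OpenAI-compatible server; adjust base_url and model to your setup.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="gpt-oss-20b",
    messages=[
        # The reasoning-effort level (low / medium / high) is declared in the
        # system prompt and can change from one request to the next.
        {"role": "system", "content": "Reasoning: high"},
        {"role": "user", "content": "How many primes are there below 100?"},
    ],
)
print(response.choices[0].message.content)
```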
The o200k_harmony tokenizer, also open-sourced, is a superset of the tokenizer used for o4-mini and GPT-4o, guaranteeing maximum compatibility with the existing ecosystem. The training data, mainly in English, was carefully curated with a focus on STEM, programming and general knowledge.
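For readers who want to inspect the tokenizer, a short sketch using tiktoken is shown below; it assumes a recent tiktoken release that ships the o200k_harmony encoding, so an upgrade may be required.

```python
import tiktoken

# "o200k_harmony" is the encoding name announced for GPT-OSS; if your tiktoken
# version predates it, get_encoding will raise and an upgrade is needed.
enc = tiktoken.get_encoding("o200k_harmony")

tokens = enc.encode("GPT-OSS exposes its full chain of thought.")
print(len(tokens), tokens[:8])
print(enc.decode(tokens))
```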
Benchmark performance: competing with proprietary models
Benchmark | GPT-OSS-120B | GPT-OSS-20B | OpenAI o3 | OpenAI o4-mini | o3-mini |
---|---|---|---|---|---|
MMLU | 90.0% | 85.3% | 93.4% | 93.0% | 87.0% |
GPQA Diamond | 80.1% | 71.5% | 83.3% | 81.4% | 77.0% |
Humanity’s Last Exam | 19.0% | 17.3% | 24.9% | 17.7% | 13.4% |
AIME 2024 | 96.6% | 96.0% | 95.2% | 98.7% | 87.3% |
AIME 2025 | 97.9% | 98.7% | 98.4% | 99.5% | 86.5% |
Codeforces (Elo) | 2622 | 2463 | 2516 | 2230 | 2706 |
HealthBench | 57.6% | 42.5% | 59.8% | 50.1% | 37.8% |
Tau-Bench Retail | 67.8% | 54.8% | 70.4% | 65.6% | – |
The results demonstrate exceptional performance, particularly in competition mathematics, where GPT-OSS-20B even surpasses o3 on AIME 2025. On HealthBench, GPT-OSS-120B sets a new standard for open models, significantly exceeding o4-mini and rivaling o3.
Agentic capabilities and tool use
The models excel in agentic workflows, natively integrating web search and Python code execution into their reasoning process. The Tau-Bench tests demonstrate robust function-calling capabilities, with GPT-OSS-120B reaching 67.8% accuracy on retail tasks, comparable to the best proprietary models.
A striking example: during a demonstration, GPT-OSS-120B chained 28 successive calls to a web-browsing tool to aggregate up-to-date information, demonstrating a level of complex orchestration hitherto unseen in an open-weight model.
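To illustrate what such function calling looks like from the developer side, here is a hedged sketch using the standard Chat Completions tool format against a locally served model; the endpoint, the model name and the get_price tool are illustrative assumptions, not part of the release.

```python
from openai import OpenAI

# Assumed local OpenAI-compatible server hosting GPT-OSS; adjust to your setup.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

# A hypothetical retail tool the model may decide to call.
tools = [{
    "type": "function",
    "function": {
        "name": "get_price",
        "description": "Return the current price of a retail item.",
        "parameters": {
            "type": "object",
            "properties": {"item": {"type": "string"}},
            "required": ["item"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "How much does the blue kettle cost?"}],
    tools=tools,
)

# The model either answers directly or emits one or more structured tool calls.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```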
Transparent chain of thought: a revolution for research
Unlike proprietary models that mask their internal reasoning, GPT-OSS fully exposes its unfiltered chain of thought. This transparency, aligned with the principles established with o1-preview, lets researchers and developers implement their own monitoring systems to detect undesirable behavior, hallucinations or bypass attempts.
OpenAI explicitly emphasizes that the CoT received no direct supervision, preserving its value as an authentic signal of the model’s reasoning process. This approach contrasts with certain competing models where the CoT is optimized for appearance rather than faithfulness to the actual process.
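As a sketch of what this transparency enables, the snippet below separates the reasoning channel from the final answer in raw harmony-formatted output. The channel markers follow the published harmony format, but they are hard-coded here as an assumption and the example string is synthetic.

```python
import re

def split_channels(raw: str) -> dict:
    """Group harmony messages by channel, e.g. 'analysis' (CoT) vs 'final' (answer)."""
    pattern = re.compile(r"<\|channel\|>(\w+)<\|message\|>(.*?)<\|(?:end|return)\|>", re.S)
    channels: dict[str, list[str]] = {}
    for name, text in pattern.findall(raw):
        channels.setdefault(name, []).append(text.strip())
    return channels

# Synthetic raw output in the harmony layout (assumption, for illustration only).
raw = ("<|start|>assistant<|channel|>analysis<|message|>The user asks for 2 + 2, "
       "which is 4.<|end|>"
       "<|start|>assistant<|channel|>final<|message|>4<|return|>")

parsed = split_channels(raw)
print(parsed["analysis"])  # the unsupervised reasoning trace, open to inspection
print(parsed["final"])     # the user-facing answer
```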
Reinforced safety: the “worst-case fine-tuning” paradigm
OpenAI introduces a revolutionary methodology to assess the safety of open models. The company deliberately created “maliciously fine-tuned” versions of GPT-OSS-120B, optimized for sensitive domains (biology, cybersecurity) using its advanced training infrastructure.
These adversarial versions were evaluated under OpenAI’s Preparedness Framework and reviewed by three groups of independent experts. The results confirm that even with optimal hostile fine-tuning, the models do not reach high-risk capability levels according to the established criteria.
CBRN (chemical, biological, radiological, nuclear) data filtering during pre-training, combined with deliberative alignment and the instruction hierarchy during post-training, establishes a new safety standard for open-weight models.
Red Teaming Challenge: $500,000 for collective security
OpenAI is launching a $500,000 Red Teaming Challenge to identify potential vulnerabilities in the GPT-OSS models. This collaborative initiative aims to mobilize the global community of security researchers to strengthen the open-source ecosystem. Validated findings will be published and the evaluation datasets open-sourced, benefiting the entire industry.
Ecosystem and availability: adoption made easy
The models are immediately available on Hugging Face with native MXFP4 quantization, accompanied by reference implementations for PyTorch and Apple Metal. The harmony renderer, available in Python and Rust, facilitates adoption of the unified prompt format.
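Getting started locally can be as simple as the sketch below, which assumes the Hugging Face repository id openai/gpt-oss-20b and a transformers build recent enough to load the MXFP4 weights; both are assumptions drawn from the release notes and may differ in your environment.

```python
from transformers import pipeline

# Assumed repository id; a recent transformers / accelerate stack is required
# for the native MXFP4 checkpoint to load on a single ~16 GB GPU.
generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",
    torch_dtype="auto",
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in one sentence."}]
result = generator(messages, max_new_tokens=128)
print(result[0]["generated_text"])
```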
The pre-established ecosystem partnerships are impressive:
- Deployment platforms: Azure, AWS, Databricks, Vercel, Cloudflare
- Inference providers: vLLM, Ollama, llama.cpp, LM Studio, Fireworks, Together AI (see the sketch after this list)
- Hardware: NVIDIA, AMD, Cerebras, Groq for specific optimizations
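As an example of the inference-provider route, here is a hedged sketch with the Ollama Python client; the ollama package usage is standard, but the gpt-oss:20b model tag is an assumption and may first require pulling the model.

```python
import ollama

# Assumed model tag; pull it first with `ollama pull gpt-oss:20b` if needed.
response = ollama.chat(
    model="gpt-oss:20b",
    messages=[{"role": "user", "content": "Summarize the Apache 2.0 license in two lines."}],
)
print(response["message"]["content"])
```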
Microsoft integrates GPT-OSS-20B directly into Windows via ONNX Runtime, accessible through Foundry Local and the AI Toolkit for VS Code, democratizing access for Windows developers.
Strategic implications: redefining the proprietary/open-source balance
This release represents a paradigm shift for OpenAI and the industry. By making cutting-edge reasoning capabilities accessible under a permissive license, OpenAI:
- Accelerates academic research by providing transparent reference models
- Democratizes access for emerging markets and organizations with limited resources
- Establishes democratic rails for AI by promoting the geographic distribution of capabilities
- Creates complementarity between API models (multimodal, integrated) and local models (customizable, private)
Perspectives and technical challenges
The real impact will depend on several critical factors. The energy consumption of MoE models remains substantial despite the optimizations. Maintenance and updates of open-weight models raise governance issues. The balance between CoT transparency and manipulation risks will require continuous vigilance.
However, this initiative undeniably marks the start of a new era. By combining advanced performance, full transparency and strong safety, GPT-OSS establishes a new paradigm for open language models. OpenAI’s decision to share not only weights but also safety and alignment methodologies could catalyze a new wave of responsible innovation in the global AI ecosystem.