GLM-5

Summary

GLM-5 is Z.ai's MIT-licensed 744B-parameter MoE frontier model with 40B active per token, released February 11, 2026 — the first frontier-tier model trained end-to-end on Huawei Ascend silicon. It scores 77.8% on SWE-bench Verified at API pricing roughly 5–8x cheaper than Claude Opus 4.6.

Overview

GLM-5 is the flagship February 2026 frontier-class large language model from [[China/Z.ai|Z.ai]] (formerly Zhipu AI), and the first frontier-tier model trained end-to-end on Chinese-domestic Huawei Ascend silicon. Released February 11, 2026 under the MIT license, GLM-5 is a 744-billion-parameter Mixture-of-Experts model with 40 billion active parameters per token. Z.ai claims GLM-5 matches Claude Opus 4.5 and GPT-5.2 on coding and agent tasks, scoring 77.8% on SWE-bench Verified, 92.7% on AIME 2026, and 86.0% on GPQA-Diamond, while leading open-source models on BrowseComp, Vending Bench 2, and MCP-Atlas.

The release matters as much for what it proves about Chinese hardware as for its benchmark scores: GLM-5 was trained entirely on Huawei Ascend chips with the MindSpore framework, and inference runs across a fully domestic stack including Moore Threads, Cambricon, and Kunlunxin chips. Combined with API pricing of roughly $1.00 per million input tokens and $3.20 per million output tokens — approximately 5x cheaper than Claude Opus 4.6 on input and ~8x cheaper on output — GLM-5 sets a new floor for capable open-weight models worldwide.
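The pricing figures above can be made concrete with back-of-envelope arithmetic. The sketch below uses only the rates and multipliers quoted in this article; the workload sizes are illustrative assumptions, not measured usage.

```python
# Back-of-envelope API cost estimate using the per-million-token rates
# quoted for GLM-5 (~$1.00 input / $3.20 output). Workload sizes are
# illustrative assumptions.

GLM5_INPUT_RATE = 1.00    # USD per 1M input tokens
GLM5_OUTPUT_RATE = 3.20   # USD per 1M output tokens

def api_cost(input_tokens: int, output_tokens: int,
             input_rate: float = GLM5_INPUT_RATE,
             output_rate: float = GLM5_OUTPUT_RATE) -> float:
    """Cost in USD for a token workload at per-million-token rates."""
    return (input_tokens / 1e6) * input_rate + (output_tokens / 1e6) * output_rate

# Example: a hypothetical agentic coding session consuming
# 2M input tokens and 0.5M output tokens.
glm5_cost = api_cost(2_000_000, 500_000)
print(f"GLM-5 cost: ${glm5_cost:.2f}")  # $3.60

# Applying the article's ~5x input / ~8x output multipliers to sketch
# the same workload at frontier-competitor pricing:
comparison_cost = api_cost(2_000_000, 500_000,
                           input_rate=GLM5_INPUT_RATE * 5,
                           output_rate=GLM5_OUTPUT_RATE * 8)
print(f"~5x/8x comparison: ${comparison_cost:.2f}")  # $22.80
```

The gap compounds quickly for agentic workloads, which are output-heavy and often consume tens of millions of tokens per task.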

Specifications

  • Developer: [[China/Z.ai|Z.ai]] (formerly Zhipu AI)
  • Release Date: February 11, 2026
  • Type: Text generation; agentic; multilingual; code; reasoning
  • Architecture: Mixture-of-Experts (MoE)
  • Total Parameters: 744B
  • Active Parameters Per Token: 40B
  • Training Hardware: Huawei Ascend (with MindSpore framework)
  • Inference Hardware: Huawei Ascend, Moore Threads, Cambricon, Kunlunxin (domestic Chinese stack)
  • License: MIT (open-weight)
  • API Pricing: ~$1.00 per million input tokens / $3.20 per million output tokens
  • Variants: GLM-5 (full), GLM-5-Turbo (faster / cheaper inference)
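The parameter figures above imply that only about 5.4% of GLM-5's weights (40B of 744B) participate in any single forward pass. The toy sketch below illustrates the top-k expert routing that makes this possible in a generic MoE layer; the expert count, k, and router details here are illustrative assumptions, since GLM-5's internals are not described in this article.

```python
# Toy top-k MoE router. In a Mixture-of-Experts layer, a router scores
# every expert for each token but only the top-k experts actually run,
# so per-token compute scales with k rather than with total expert count.
# Dimensions here are illustrative, not GLM-5's actual configuration.
import math
import random

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_logits, k=2):
    """Select the top-k experts for one token and renormalize their gates.

    Returns a list of (expert_index, gate_weight) pairs whose weights
    sum to 1.0; only these k experts would execute for this token.
    """
    probs = softmax(router_logits)
    topk = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in topk)
    return [(i, probs[i] / norm) for i in topk]

random.seed(0)
logits = [random.gauss(0, 1) for _ in range(16)]  # toy: 16 experts
print(route_token(logits, k=2))  # two (expert, weight) pairs, weights sum to 1
```

Because the unselected experts' weights never enter the computation, a 744B-parameter model can run with the per-token compute of a ~40B dense model, though all 744B parameters must still be held in memory.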

Capabilities

Coding & SWE-bench Verified: 77.8% on SWE-bench Verified, the headline benchmark for production agentic software engineering, placing GLM-5 in the same band as Claude Opus 4.5 and GPT-5.2.

Math Reasoning: 92.7% on AIME 2026, putting GLM-5 at the leading edge of contest-math performance among open-weight models.

Scientific Reasoning: 86.0% on GPQA-Diamond, among the most demanding multiple-choice graduate-level science benchmarks.

Agentic Workflows: Leads open-source models on BrowseComp (browser-driven task completion), Vending Bench 2 (multi-step business operations), and MCP-Atlas (Model Context Protocol agent benchmarks).

Multilinguality: Strong Chinese-language performance reflecting Z.ai's primary market, with competitive English-language results on standard evals.

Open-Weight Availability: Full weights released under MIT license, allowing self-hosted deployment without commercial-license restrictions.

Limitations

GLM-5's benchmark claims are self-reported by Z.ai; while early independent replications support the core claims, comprehensive third-party evaluation is still ongoing as of May 2026. The 744B-parameter total footprint requires substantial GPU memory for self-hosted inference, because all 744B parameters must be resident in memory even though only 40B are active per token; most enterprise users will likely continue to rely on the API. Geopolitically, U.S. enterprise customers may face procurement-policy or export-control complications when adopting GLM-5, given Z.ai's Chinese national-champion positioning and use of Huawei training hardware. Cybersecurity and content-safety alignment is calibrated to Chinese regulatory standards, which differ in some areas from Western norms.

Recent Developments

  • GLM-5.1 Released (Late March 2026 / Open-Weight April 8, 2026): Iterative upgrade over GLM-5 with improvements across reasoning, agentic, and multilingual benchmarks; released to subscribers in late March and open-sourced April 8, 2026.
  • Hong Kong IPO Backdrop (January 8, 2026): Z.ai's $558M Hong Kong IPO at a $7.1B valuation — the world's first publicly listed foundation-model company — provided capital markets context for the GLM-5 launch one month later.
  • Domestic Hardware Stack Validation: GLM-5's training-and-inference stack (Huawei Ascend + MindSpore + Moore Threads / Cambricon / Kunlunxin) is the strongest empirical answer yet to U.S. export controls and is widely cited in Chinese policy discourse as evidence of chip independence.

Last Updated

May 8, 2026