zai-org/GLM-4.7
A state-of-the-art text generation model with 358B parameters, supporting English and Chinese, optimized for agentic reasoning, coding, and complex tool use.

What is GLM-4.7?
GLM-4.7 is an advanced large language model (LLM) designed to be a "coding partner" and an engine for autonomous agents. It belongs to the GLM-4 series and focuses on "Agentic, Reasoning, and Coding" (ARC) capabilities, moving toward more seamless integration into real-world tasks through features like "thinking before acting".
Key Features
- Core Coding: Provides significant gains in multilingual agentic coding and terminal-based tasks, outperforming its predecessors on benchmarks like SWE-bench (73.8%) and Terminal Bench 2.0 (41.0%).
- Interleaved & Preserved Thinking (see the sketch after this list):
  - Interleaved Thinking: The model reasons before every response or tool call to improve instruction following.
  - Preserved Thinking: In multi-turn coding scenarios, the model retains thinking blocks across turns to maintain consistency and reduce information loss.
- Vibe Coding: Improved UI quality for generating cleaner, more modern webpages and accurate slide layouts.
- Complex Reasoning: Enhanced mathematical and reasoning skills, achieving a 42.8% score on the Humanity’s Last Exam (HLE) benchmark with tools.
- Tool Use: Significant performance boosts in web browsing (BrowseComp) and complex tool-calling environments (τ-Bench).
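Preserved Thinking is easiest to picture as a client-side loop that keeps the model's thinking blocks in the conversation history instead of stripping them out. The sketch below is a minimal illustration, not the official integration: it assumes an OpenAI-compatible endpoint (for example one exposed by vLLM or SGLang) at a hypothetical local URL, and assumes the thinking block is returned in a `reasoning_content` field; both the URL and the field name may differ in your deployment.

```python
# Minimal sketch of Preserved Thinking in a multi-turn coding session.
# Assumptions (not from the model card): an OpenAI-compatible server at
# BASE_URL, and thinking blocks returned as "reasoning_content".
import requests

BASE_URL = "http://localhost:8000/v1"   # hypothetical local endpoint (vLLM/SGLang)
MODEL = "zai-org/GLM-4.7"

history = [{"role": "system", "content": "You are a careful coding assistant."}]

def chat(user_msg: str) -> str:
    history.append({"role": "user", "content": user_msg})
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        json={
            "model": MODEL,
            "messages": history,
            "temperature": 1.0,  # recommended sampling settings
            "top_p": 0.95,
        },
        timeout=600,
    ).json()
    msg = resp["choices"][0]["message"]
    # Preserved Thinking: keep the thinking block in the history we send back,
    # rather than discarding it, so later turns can build on earlier plans.
    history.append({
        "role": "assistant",
        "content": msg["content"],
        "reasoning_content": msg.get("reasoning_content", ""),  # assumed field name
    })
    return msg["content"]

print(chat("Refactor utils.py to remove the global cache."))
print(chat("Now add unit tests for the new cache class."))
```

The key design point is simply that assistant turns appended to `history` keep their thinking blocks, so later turns can reuse earlier plans rather than re-deriving them.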
Use Cases
- Autonomous Software Engineering: Solving real-world GitHub issues and performing terminal-based tasks.
- Agentic Workflows: Powering agent frameworks like Claude Code, Cline, and Roo Code where "thinking" and planning are required.
- Front-end & Document Design: Creating functional UI components and formatted presentations.
- High-Level Research & Math: Utilizing specialized reasoning for complex academic or technical problem-solving.
FAQ & Technical Details
- Can I serve it locally? Yes, it supports deployment via vLLM, SGLang, and Transformers (see the example below).
- What are the recommended settings? For most tasks, use `temperature: 1.0` and `top-p: 0.95`. For agentic tasks, it is highly recommended to enable Preserved Thinking mode.
- What is the model size? The model features 358B parameters and uses the `BF16` tensor type.
- Is there an API? Yes, API services are available on the Z.ai API Platform.
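Below is a rough local-inference sketch with Transformers that also applies the recommended sampling settings from the FAQ. It assumes a machine with enough GPU memory for the 358B BF16 weights (sharded with `device_map="auto"`); exact loading flags, such as whether `trust_remote_code` is needed, depend on the released checkpoint and are not confirmed here.

```python
# Minimal local-inference sketch with Hugging Face Transformers.
# Assumes hardware large enough for the 358B BF16 weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "zai-org/GLM-4.7"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # model card lists BF16 weights
    device_map="auto",           # shard across available GPUs
)

messages = [{"role": "user", "content": "Write a Python function that merges two sorted lists."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.0,  # recommended default
    top_p=0.95,       # recommended default
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For serving rather than one-off generation, vLLM (`vllm serve zai-org/GLM-4.7`) or SGLang can expose the model behind an OpenAI-compatible API, which is the setup assumed in the Preserved Thinking sketch above.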
