GLM-4.5V
A frontier vision-language model by Z.ai
Deploy GLM-4.5V behind an API endpoint in seconds.
GLM-4.5V is a state-of-the-art 106-billion-parameter vision-language model with excellent vision and frontend coding capabilities. You can deploy GLM-4.5V on NVIDIA H100 GPUs with Baseten today.
Deployments of GLM-4.5V are OpenAI-compatible.
Input
from openai import OpenAI
import os

model_url = ""  # Copy in from API pane in Baseten model dashboard

client = OpenAI(
    api_key=os.environ['BASETEN_API_KEY'],
    base_url=model_url,
)

# Chat completion with an image and a text prompt
response_chat = client.chat.completions.create(
    model="zai-org/GLM-4.5V",
    messages=[
        {"role": "system", "content": "You are a helpful vision-language assistant."},
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/f/fa/Grayscale_8bits_palette_sample_image.png"}},
                {"type": "text", "text": "Describe this image in detail."},
            ],
        },
    ],
    max_tokens=1024,
    temperature=0.7,
)
print(response_chat)
JSON output
{
  "id": "143",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "[Model output here]",
        "role": "assistant",
        "audio": null,
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1741224586,
  "model": "",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 145,
    "prompt_tokens": 38,
    "total_tokens": 183,
    "completion_tokens_details": null,
    "prompt_tokens_details": null
  }
}
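The generated text is available at response_chat.choices[0].message.content. If you would rather stream tokens as they are generated, the same OpenAI-compatible endpoint also accepts stream=True. The snippet below is a minimal sketch of that pattern, assuming the client and model name configured in the example above.

# Streamed chat completion: a minimal sketch, assuming the client configured above
stream = client.chat.completions.create(
    model="zai-org/GLM-4.5V",
    stream=True,
    messages=[
        {"role": "user", "content": "Describe GLM-4.5V in one sentence."},
    ],
    max_tokens=256,
    temperature=0.7,
)

for chunk in stream:
    # Each chunk carries an incremental delta; skip chunks with no new text
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)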