Deploying the SCNet Environment on Hygon GPUs

Posted 2026-02-09, updated 2026-04-20, filed under ai

Preface

Related sites

AI model management

https://www.scnet.cn/ui/console/index.html#/model-management/list

Image management

https://www.scnet.cn/ui/console/index.html#/container-service/my-image

Model deployment page

https://www.scnet.cn/ui/console/index.html#/model-deploy

Preparation

Create the directory

mkdir -p /public/home/hnxhly/tools/z-ai-docker/ai-server-common

Environment test code

Verify that PyTorch recognizes the Hygon GPU

/public/home/hnxhly/tools/z-ai-docker/ai-server-common/cuda_test.py

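import torch

print(torch.__version__)          # PyTorch build version
print(torch.cuda.is_available())  # True if a GPU is visible
print(torch.cuda.device_count())  # number of visible GPUs
# get_device_name raises when no device is present, so guard the call
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))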

Note:

On ROCm, the torch.cuda interface remains available (AMD and Hygon provide a CUDA API compatibility layer), so the code runs without modification.
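Because torch.cuda answers on both CUDA and ROCm builds, it can help to check which backend the installed build actually targets. The sketch below assumes the Hygon DTK build of PyTorch reports a HIP version through `torch.version.hip` the way upstream ROCm builds do, which this post does not confirm; `describe_backend` is a hypothetical helper name.

```python
import importlib


def describe_backend(torch_mod):
    """Return a short label for the GPU backend a PyTorch build targets."""
    version = getattr(torch_mod, "version", None)
    hip = getattr(version, "hip", None)    # set on ROCm/HIP builds
    cuda = getattr(version, "cuda", None)  # set on CUDA builds
    if hip:
        return f"ROCm/HIP {hip}"
    if cuda:
        return f"CUDA {cuda}"
    return "CPU-only"


if __name__ == "__main__":
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        print(describe_backend(torch))
```

If the label comes back as CPU-only inside the container, the GPU checks in cuda_test.py will report no devices regardless of the hardware.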

The code lives in

/public/home/hnxhly/tools/z-ai-docker/ai-server-common

Download the model

Environment variable

Create the directory

mkdir -p /public/home/hnxhly/tools/modelscope
chmod 755 /public/home/hnxhly/tools/modelscope

Add the environment variable

vim ~/.bashrc

Add the following lines

# ModelScope cache directory
export MODELSCOPE_CACHE="/public/home/hnxhly/tools/modelscope"

Reload the configuration

source ~/.bashrc

Verify

echo $MODELSCOPE_CACHE
# Expected output: /public/home/hnxhly/tools/modelscope
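The echo check only confirms that the variable is set. A slightly stricter check, sketched here in Python, also confirms that the directory exists and is writable. `check_cache_dir` is a hypothetical helper, not part of ModelScope.

```python
import os
from pathlib import Path


def check_cache_dir(env=os.environ, name="MODELSCOPE_CACHE"):
    """Return a list of problems with the cache setting; empty means OK."""
    value = env.get(name)
    if not value:
        return [f"{name} is not set"]
    path = Path(value)
    problems = []
    if not path.is_dir():
        problems.append(f"{path} is not a directory")
    elif not os.access(path, os.W_OK | os.X_OK):
        problems.append(f"{path} is not writable by the current user")
    return problems


if __name__ == "__main__":
    issues = check_cache_dir()
    print("OK" if not issues else "\n".join(issues))
```

Run it in a fresh shell after `source ~/.bashrc`; an unset variable usually means the export line was added to a file that the shell does not read.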

Download

Qwen2.5-VL-32B-Instruct

Qwen2.5-VL-32B-Instruct · Model Library

Download the model

modelscope download --model Qwen/Qwen3-8B

The model is downloaded to

/data/tools/modelscope
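The same download can be scripted through ModelScope's Python API. This is a minimal sketch: `download_model` is a hypothetical wrapper, and it assumes the installed modelscope package exports `snapshot_download` with a `cache_dir` keyword.

```python
def download_model(model_id, cache_dir=None):
    """Download a model snapshot with ModelScope and return its local path."""
    # Imported lazily so this file still loads where modelscope is missing.
    from modelscope import snapshot_download

    # Only pass cache_dir when given, so ModelScope's own default applies otherwise.
    kwargs = {"cache_dir": cache_dir} if cache_dir else {}
    return snapshot_download(model_id, **kwargs)
```

Calling `download_model("Qwen/Qwen3-8B")` returns the directory the files were written to. With no `cache_dir`, the location is expected to follow MODELSCOPE_CACHE, though the exact subdirectory layout can differ between modelscope versions.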

Copy the model

Location of models downloaded through the platform

~/SothisAI/model/Aihub/Qwen3-8B/main/Qwen3-8B

Move them to the directory set by the environment variable

mkdir -p $MODELSCOPE_CACHE/models/Qwen/Qwen3-8B
mv ~/SothisAI/model/Aihub/Qwen3-8B/main/Qwen3-8B $MODELSCOPE_CACHE/models/Qwen/

mkdir -p $MODELSCOPE_CACHE/models/Qwen/Qwen2.5-VL-32B-Instruct
mv ~/SothisAI/model/Aihub/Qwen2.5-VL-32B-Instruct/main/Qwen2.5-VL-32B-Instruct $MODELSCOPE_CACHE/models/Qwen/
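When several models need relocating, the move can be wrapped in a small function that refuses to overwrite a populated destination. This is a sketch under the layout used in the commands above (`<cache>/models/<org>/<name>`); `move_model` is a hypothetical helper.

```python
import shutil
from pathlib import Path


def move_model(src, cache_root, model_id):
    """Move a model directory to <cache_root>/models/<org>/<name>."""
    src = Path(src).expanduser()
    dest = Path(cache_root) / "models" / model_id
    if not src.is_dir():
        raise FileNotFoundError(f"source model directory not found: {src}")
    if dest.is_dir() and not any(dest.iterdir()):
        dest.rmdir()  # drop an empty placeholder so the move cannot nest
    elif dest.exists():
        raise FileExistsError(f"destination already in use: {dest}")
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dest))
    return dest
```

For example, `move_model("~/SothisAI/model/Aihub/Qwen3-8B/main/Qwen3-8B", os.environ["MODELSCOPE_CACHE"], "Qwen/Qwen3-8B")` (after `import os`) mirrors the first pair of shell commands.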

Startup settings

File mounts

ModelScope download directory ➡ /data/tools/modelscope
Project directory ➡ /app

Docker

Dockerfile

/public/home/hnxhly/tools/z-ai-docker/ai-server-common/Dockerfile

FROM image.sourcefind.cn:5000/dcu/admin/base/vllm:0.8.5-ubuntu22.04-dtk25.04.1-rc5-das1.6-py3.10-20250724

RUN pip install modelscope
ENV MODELSCOPE_CACHE="/data/tools/modelscope"

RUN pip install "qwen-vl-utils[decord]==0.0.8"

Test

{
  "model": "Qwen/Qwen3-8B",
  "messages": [
    {
      "role": "user",
      "content": "Basic Java syntax"
    }
  ],
  "stream": true,
  "chat_template_kwargs": {
    "enable_thinking": false
  }
}

curl test

curl 'https://public-2021127773854605315-iaab.zzai2.scnet.cn:58041/v1/chat/completions' \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen/Qwen3-8B",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "Basic Java syntax"}
        ],
        "stream": true,
        "chat_template_kwargs": {
          "enable_thinking": false
        }
      }'
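The same request can be sent from Python using only the standard library. The sketch below assumes the endpoint streams OpenAI-style server-sent events (`data: {...}` lines ending with `data: [DONE]`), which the curl output should confirm; `build_payload`, `parse_sse_line`, and `stream_chat` are hypothetical helper names.

```python
import json
import urllib.request


def build_payload(prompt, model="Qwen/Qwen3-8B", thinking=False):
    """Build the request body used in the test above."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
        "chat_template_kwargs": {"enable_thinking": thinking},
    }


def parse_sse_line(line):
    """Return the text fragment carried by one SSE line, or None."""
    line = line.strip()
    if not line.startswith("data:"):
        return None
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None
    choices = json.loads(data).get("choices") or []
    if not choices:
        return None
    return choices[0].get("delta", {}).get("content")


def stream_chat(url, prompt):
    """Yield text fragments from a streaming chat completion."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        for raw in response:  # the response is iterated line by line
            text = parse_sse_line(raw.decode("utf-8"))
            if text:
                yield text
```

Pass the full `/v1/chat/completions` URL of the deployment to `stream_chat` and print each fragment with `end=""` to reproduce the streamed reply in a terminal.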
