当前位置：首页 > news >正文

本地部署Qwen2.5-VL-7B-Instruct模型

news 2025/9/4 12:28:17

本地部署Qwen2.5-VL-7B-Instruct模型

本地部署Permalink

**创建环境**
conda create -n qwenvl python=3.11 -y# 报错：
Solving environment: failedPackagesNotFoundError: The following packages are not available from current channels:# 处理：
conda config --append channels conda-forge# 激活环境 (注意这里的语法)
conda activate qwenvlyum -y install git

安装依赖


pip install vllm
报错：
error: subprocess-exited-with-error× Preparing metadata (pyproject.toml) did not run successfully.│ exit code: 1╰─> [6 lines of output]Cargo, the Rust package manager, is not installed or is not on PATH.This package requires Rust and Cargo to compile extensions. Install it throughthe system's package manager or via https://rustup.rs/Checking for Rust toolchain....[end of output]note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed× Encountered error while generating package metadata.
╰─> See above for output.note: This is an issue with the package mentioned above, not pip.
hint: See above for details.处理：
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
或者使用如下方法
# 安装
(qwenvl) [root@localhost qwenvl]# rustc --version
bash: rustc: command not found...
Install package 'rust' to provide command 'rustc'? [N/y] y* Waiting in queue... 
The following packages have to be installed:rust-1.66.1-1.el9.x86_64	The Rust Programming Languagerust-std-static-1.66.1-1.el9.x86_64	Standard library for Rust
Proceed with changes? [N/y] y* Waiting in queue... * Waiting for authentication... * Waiting in queue... * Requesting data... * Testing changes... * Installing packages... 
rustc 1.66.1 (90743e729 2023-01-10) (Red Hat 1.66.1-1.el9)# 安装
(qwenvl) [root@localhost qwenvl]# cargo --version
bash: cargo: command not found...
Install package 'cargo' to provide command 'cargo'? [N/y] y* Waiting in queue... * Loading list of packages.... 
The following packages have to be installed:cargo-1.66.1-1.el9.x86_64	Rust's package manager and build tool
Proceed with changes? [N/y] y* Waiting in queue... * Waiting for authentication... * Waiting in queue... * Requesting data... * Testing changes... * Installing packages... 
cargo 1.66.1

继续安装

pip install vllm
pip install git+https://github.com/huggingface/transformers accelerate# 如果 transformers 拉取失败，手动下载代码并安装
访问 https://github.com/huggingface/transformers
点击 Code > Download ZIP，下载 ZIP 文件。
解压 ZIP 文件并进入目录：unzip transformers-main.zip
cd transformers-main
本地安装：
pip install .pip install torch
pip install flash-attn --no-build-isolation
pip install "huggingface_hub[hf_transfer]"
pip install modelscope
pip install qwen_vl_utils

拉取模型

cd /data/
mkdir -p qwen2.5/Qwen2.5-VL-7B-Instruct #创建模型文件夹 
cd qwen2.5/ #进到数据盘目录#下载模型到指定文件夹
modelscope download --model Qwen/Qwen2.5-VL-7B-Instruct --local_dir ./Qwen2.5-VL-7B-Instruct # local_dir后是下载到指定文件夹

测试脚本

vim vl_demo.pyfrom transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info
import torch# 清空CUDA缓存，以释放内存
torch.cuda.empty_cache()# 指定使用第3块显卡 (如果显卡可用)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")# 加载模型到指定的设备
# 指定模型的位置
model = Qwen2_5_VLForConditionalGeneration.from_pretrained("/data/qwen2.5/Qwen2.5-VL-7B-Instruct", torch_dtype="auto"  # 自动选择精度
)# 将模型迁移到指定设备（如GPU或CPU）
model.to(device)# 加载处理器，用于处理输入的图像和文本
processor = AutoProcessor.from_pretrained("/data/qwen2.5/Qwen2.5-VL-7B-Instruct")# 准备消息和输入内容
messages = [{"role": "user",  # 用户角色"content": [{"type": "image",  # 图像类型# "image": "/opt/src/vl_demo1/20250219101931.jpg",  # 图像路径（本地路径）"image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",  # 图像路径（本地路径）"max_pixels": 512 * 512,  # 图像最大尺寸限制},# 这段代码注释掉了文本输入，如果需要可以打开#{"type": "text", "text": "Describe this image."},# {"type": "text", "text": "请认真阅读图像内容后，以HTML结构化输出。"},  # 文本输入，要求生成HTML格式的描述{"type": "text", "text": "图片表达的是什么意思,以中文结果输出。"},  # 文本输入，要求生成HTML格式的描述],}
]# 准备输入文本，用于推理
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True  # 准备聊天模板并添加生成提示
)# 处理图像和视频输入
image_inputs, video_inputs = process_vision_info(messages)# 使用处理器准备所有输入
inputs = processor(text=[text],images=image_inputs,videos=video_inputs,padding=True,return_tensors="pt",  # 将输入转换为 PyTorch 张量
)# 将所有输入迁移到与模型相同的设备
inputs = inputs.to(device)# 进行推理：生成模型的输出
generated_ids = model.generate(**inputs, max_new_tokens=4096)  # 最大生成 4096 个新token
generated_ids_trimmed = [out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]# 解码生成的输出并打印
output_text = processor.batch_decode(generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)  # 输出结果