当前位置：首页 > news >正文

Llama 3 超级课堂 -笔记

news 2025/8/13 22:40:24

课程文档： https://github.com/SmartFlowAI/Llama3-Tutorial

课程视频：https://space.bilibili.com/3546636263360696/channel/series

1 环境配置

1.1 创建虚拟环境,名为：llama3

conda create -n llama3 python=3.10

1.2 下载、安装 pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1

conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia

1.3 通过软连接获取 Meta-Llama-3-8B-Instruct模型

ln -s /root/share/new_models/meta-llama/Meta-Llama-3-8B-Instruct ~/model/Meta-Llama-3-8B-Instruct

1.4 获取Xtuner微调工具

cd ~
git clone -b v0.1.18 https://github.com/InternLM/XTuner
cd XTuner
pip install -e .

2 Llama 3 Web Demo 部署

3 XTuner 完成小助手认知微调

3.1 自我认知训练数据集准备

cd ~/Llama3-Tutorial
python tools/gdata.py

以上脚本在生成了 ~/Llama3-Tutorial/data/personal_assistant.json 数据文件格式如下所示：

训练模型

xtuner train configs/assistant/llama3_8b_instruct_qlora_assistant.py --work-dir /root/llama3_pth

Adapter PTH 转 HF 格式

xtuner convert pth_to_hf /root/llama3_pth/llama3_8b_instruct_qlora_assistant.py \/root/llama3_pth/iter_500.pth \/root/llama3_hf_adapter

模型合并

export MKL_SERVICE_FORCE_INTEL=1
xtuner convert merge /root/model/Meta-Llama-3-8B-Instruct \/root/llama3_hf_adapter\/root/llama3_hf_merged

模型推理

streamlit run ~/Llama3-Tutorial/tools/internstudio_web_demo.py \/root/llama3_hf_merged

4 Llama 3 图片理解能力微调

获取 Llama3 权重、Visual Encoder 权重、 Image Projector 权重

由上图报错，deepspeed未安装，所以通过 pip install deepspeed。以及也要需要安装 mpi4py

使用pip install mpi4py时，报如下错误出错，解决方法，见：https://blog.csdn.net/weixin_51762856/article/details/134247764

由于显存有限，无法进行模型训练了

5 Llama 3 高效部署实践

安装lmdeploy最新版

直接使用lmdeploy进行推理，显存占有：36G左右

推理结果：

把--cache-max-entry-count参数设置为0.5 ，显存占有：28G左右

把--cache-max-entry-count参数设置为0.01，显存占16G左右

使用W4A16量化

lmdeploy lite auto_awq \/root/model/Meta-Llama-3-8B-Instruct \--calib-dataset 'ptb' \--calib-samples 128 \--calib-seqlen 1024 \--w-bits 4 \--w-group-size 128 \--work-dir /root/model/Meta-Llama-3-8B-Instruct_4bit

使用Chat功能运行W4A16量化后的模型。

启动API服务器

lmdeploy serve api_server \/root/model/Meta-Llama-3-8B-Instruct \--model-format hf \--quant-policy 0 \--server-name 0.0.0.0 \--server-port 23333 \--tp 1

本地需要ssh转发

命令行客户端连接API服务器

网页客户端连接API服务器

pip install gradio==3.50.2
lmdeploy serve gradio http://localhost:23333 \--server-name 0.0.0.0 \--server-port 6006

查看全文

http://www.lryc.cn/news/349378.html

Leetcode 第 129 场双周赛题解

队列的讲解

算法学习笔记（LCA）

记一次苹果appstore提审拒审问题1.2

在做题中学习（59）：除自身以为数组的乘积

centos 把nginx更新到最新版本

01.认识HTML及常用标签

从零开始：C++ String类的模拟实现

银河麒麟服务器操作系统V10-SP2部署gitlab服务

【计算机毕业设计】基于SSM+Vue的线上旅行信息管理系统【源码+lw+部署文档+讲解】

链表CPP简单示例

智能EDM邮件群发工具哪个好？

低代码与AI技术发展：开启数字化新时代

风电功率预测 | 基于遗传算法优化BP神经网络实现风电功率预测（附matlab完整源码)

uni-segmented-control插件使用

被动防护不如主动出击

ollama离线部署llama3（window系统）

基于Django实现的（bert）深度学习文本相似度检测系统设计

数据中心网络随想-电路交换

并行执行线程资源管理方式——《OceanBase 并行执行》系列 3

数据库系统概论（个人笔记）（第二部分）

WebView基础知识以及Androidx-WebKit的使用

解锁AI写作新纪元的文心一言指令

前端学习——工具的使用

图的拓扑序列（BFS_如果节点带着入度信息）

Linux常用指令集合

前端 JS 经典：为什么需要模块化

MySQL：某字段追加随机数

研发管理-选择研发管理系统-研发管理系统哪个好

学校NTP时钟系统（时间同步系统）方案助力建设智慧校园

1 环境配置

2 Llama 3 Web Demo 部署

3 XTuner 完成小助手认知微调

4 Llama 3 图片理解能力微调

5 Llama 3 高效部署实践

相关文章：