当前位置: 首页 > news >正文

deploy local llm ragflow

CPU >= 4 cores
RAM >= 16 GB
Disk >= 50 GB
Docker >= 24.0.0 & Docker Compose >= v2.26.1

下载docker:

官方下载方式:https://docs.docker.com/desktop/install/ubuntu/

其中 DEB package需要手动下载并传输到服务器

国内下载方式:
https://blog.csdn.net/u011278722/article/details/137673353

Ensure vm.max_map_count >= 262144:

check:
$ sysctl vm.max_map_count

Reset vm.max_map_count to a value at least 262144 if it is not:
$ sudo sysctl -w vm.max_map_count=262144

This change will be reset after a system reboot. To ensure your change remains permanent, add or update the vm.max_map_count value in /etc/sysctl.conf accordingly:
$ vm.max_map_count=262144

Clone the repo:
$ git clone https://github.com/infiniflow/ragflow.git
该步骤需要手动下载并传输,国内无法下载

Build the pre-built Docker images and start up the server:
$ cd ragflow/docker
$ chmod +x ./entrypoint.sh
$ docker compose up -d
这一步也需要手动传输或直接用用源代码build(见最后)

Check the server status after having the server up and running:
$ docker logs -f ragflow-server

The following output confirms a successful launch of the system:
____ ______ __
/ __ \ ____ _ ____ _ / // / _ __
/ // // __ // __ // / / // __ | | /| / /
/ , // // // // // / / // // /| |/ |/ /
/
/ || _,/ _, /// // _
/ |/|_/
/____/

  • Running on all addresses (0.0.0.0)
  • Running on http://127.0.0.1:9380
  • Running on http://x.x.x.x:9380
    INFO:werkzeug:Press CTRL+C to quit

In your web browser, enter the IP address of your server and log in to RAGFlow.

With the default settings, you only need to enter http://IP_OF_YOUR_MACHINE (sans port number) as the default HTTP serving port 80 can be omitted when using the default configurations.

In service_conf.yaml, select the desired LLM factory in user_default_llm and update the API_KEY field with the corresponding API key.

See llm_api_key_setup for more information.

Rebuild:

To build the Docker images from source:
$ git clone https://github.com/infiniflow/ragflow.git
$ cd ragflow/
$ docker build -t infiniflow/ragflow:dev .
$ cd ragflow/docker
$ chmod +x ./entrypoint.sh
$ docker compose up -d

卸载原有cuda和驱动
https://blog.alumik.cn/posts/90/#:~:text=Use%20the%20following%20command%20to%20uninstall%20a%20Toolkit,remove%20–purge%20%27%5Envidia-.%2A%27%20sudo%20apt-get%20remove%20–purge%20%27%5Elibnvidia-.%2A%27

CUDA 和 Nvdia driver安装:
https://blog.hellowood.dev/posts/ubuntu-22-%E5%AE%89%E8%A3%85-nvdia-%E6%98%BE%E5%8D%A1%E9%A9%B1%E5%8A%A8%E5%92%8C-cuda/

下载Vllm
https://qwen.readthedocs.io/zh-cn/latest/deployment/vllm.html

国内下载model: /Qwen2-7B-Instruct方法:
pip install modelscope
from modelscope import snapshot_download
model_dir = snapshot_download(‘qwen/Qwen2-7B-Instruct’, cache_dir=‘/home/llmlocal/qwen/qwen/’)

运行llm服务器
python -m vllm.entrypoints.openai.api_server --model /home/llmlocal/qwen/qwen/Qwen2-7B-Instruct --host 0.0.0.0 --port 8000

测试:
curl http://localhost:8000/v1/chat/completions -H “Content-Type: application/json” -d ‘{
“model”: “/home/llmlocal/qwen/qwen/Qwen2-7B-Instruct”,
“messages”: [
{“role”: “system”, “content”: “You are a helpful assistant.”},
{“role”: “user”, “content”: “Tell me something about large language models.”}
],
“temperature”: 0.7,
“top_p”: 0.8,
“repetition_penalty”: 1.05,
“max_tokens”: 512
}’

更改ragflow的MODEL_NAME = “/home/llmlocal/qwen/qwen/Qwen2-7B-Instruct” 路径在rag里的chat_model

http://www.lryc.cn/news/410625.html

相关文章:

  • 测桃花运(算姻缘)的网站系统源码
  • 电商平台优惠券
  • 内衣洗衣机多维度测评对比,了解觉飞、希亦、鲸立哪款内衣洗衣机更好
  • 数据结构和算法入门
  • 基于OpenCV C++的网络实时视频流传输——Windows下使用TCP/IP编程原理
  • (BS ISO 11898-1:2015)CAN_FD 总线协议详解6- PL(物理层)规定3
  • docker环境下php安装扩展步骤 以mysqli为例
  • 医院综合绩效核算系统,绩效核算系统源码,采用springboot+avue+MySQL技术开发,可适应医院多种绩效核算方式。
  • ROOM数据快速入门
  • 刷新,前面接口的返回值没有到,第二个接口已经请求完了,导致第二个接口返回数据错误
  • pdcj设计
  • 【数据结构】哈希表的模拟实现
  • 面试经典算法150题系列-数组/字符串操作之多数元素
  • 海南云亿商务咨询有限公司领航抖音电商服务
  • C#初级——继承
  • Github 2024-07-29 开源项目日报 Top10
  • nginx反向代理和负载均衡+安装jdk-22.0.2
  • 软考高级科目怎么选?软考高级含金量排序
  • 【机器学习西瓜书学习笔记——模型评估与选择】
  • vue3+cesium创建地图
  • Zookeeper客户端和服务端NIO网络通信源码剖析
  • 从DevOps到DevSecOps是怎样之中转变?
  • ORM与第三方数据库对接的探讨及不同版本数据库的影响
  • Windows远程桌面无法拷贝文件问题
  • 优化数据处理效率,解读 EasyMR 大数据组件升级
  • 并发编程AtomicInteger详解
  • ctfshow 权限维持 web670--web679
  • 职场生存指南
  • Spring源码(八)--Spring实例化的策略
  • 部署KVM虚拟化平台