
Deploying the Grounded-SAM Demo

news 2025/8/26 22:04:25

1 Environment setup

2 Installing the Grounded-SAM demo

3 Running the demo

3.1 Running the Gradio app

3.2 Using the Gradio app

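1 Environment setup

SAM recommends CUDA 11.3 or later; this guide uses CUDA 11.4.

SAM is built on PyTorch, so a Python environment is required. This guide uses conda, which makes installing the dependencies easier. Python 3.8 or later is required; Python 3.8 is used here. The steps for setting up Python 3.8 with conda are as follows.

(1) Install Anaconda

Download Anaconda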

wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-5.3.1-Linux-x86_64.sh

Install Anaconda

bash Anaconda3-5.3.1-Linux-x86_64.sh

(2) Create a Python 3.8 virtual environment

conda create -n SAM python=3.8

(3) Activate the Python 3.8 virtual environment

source activate SAM
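With the environment active, a quick check can confirm that the interpreter meets the Python 3.8 minimum stated above. The following is a minimal sketch; the helper names `python_ok` and `describe_torch` are hypothetical, and the PyTorch report only works once PyTorch has been installed into the environment.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this guide


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if a (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) >= tuple(minimum)


def describe_torch():
    """Report PyTorch and CUDA status, or note that PyTorch is missing."""
    try:
        import torch  # optional: may not be installed yet
    except ImportError:
        return "PyTorch is not installed in this environment"
    return (
        f"PyTorch {torch.__version__}, "
        f"CUDA available: {torch.cuda.is_available()}"
    )


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print(describe_torch())
```

Running the script inside the SAM environment prints whether the interpreter is recent enough and, if PyTorch is present, whether it can see a CUDA device.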

2 Installing the Grounded-SAM demo

(1) Download Grounded-SAM

git clone https://github.com/IDEA-Research/Grounded-Segment-Anything.git

(2) Install Grounded-SAM

cd Grounded-Segment-Anything

1> Install Segment Anything:

python -m pip install -e segment_anything

2> Install Grounding DINO:

python -m pip install -e GroundingDINO

3> Install diffusers:

pip install --upgrade diffusers[torch]

4> Install OSX:

git submodule update --init --recursive
cd grounded-sam-osx && bash install.sh

5> Install Tag2Text (run from the repository root, since the previous step changes into grounded-sam-osx):

git submodule update --init --recursive
cd Tag2Text && pip install -r requirements.txt

6> Install other optional components:

pip install opencv-python pycocotools matplotlib onnxruntime onnx ipykernel
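After these installs, it can help to confirm that the packages are importable before moving on. The sketch below uses only the standard library; the list holds the import names (which differ from some pip names, for example `cv2` for opencv-python), and the helper `missing_modules` is a hypothetical name.

```python
from importlib.util import find_spec

# Import names (not pip names) for the packages installed in this section.
EXPECTED_MODULES = [
    "segment_anything",
    "groundingdino",
    "diffusers",
    "cv2",           # provided by opencv-python
    "pycocotools",
    "matplotlib",
    "onnxruntime",
    "onnx",
]


def missing_modules(names=EXPECTED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```

`find_spec` only locates a module without importing it, so the check is fast and does not load any model code.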

3 Running the demo

Grounded-SAM provides the demos listed below, each combining one or more large models to deliver more powerful visual processing.

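(1) GroundingDINO: detect anything with a text prompt.

(2) GroundingDINO + Segment-Anything: detect and segment anything with a text prompt.

(3) GroundingDINO + Segment-Anything + Stable-Diffusion: detect, segment, and generate anything with text prompts.

(4) Grounded-SAM + Stable-Diffusion Gradio APP: a web service offering text-prompted and fully automatic detection, segmentation, and generation.

(5) Grounded-SAM + Tag2Text: an automatic labeling system with strong image tagging.

(6) Grounded-SAM + BLIP: an automatic labeling system.

(7) Whisper + Grounded-SAM: detect and segment anything with speech.

(8) Grounded-SAM + Visual ChatGPT: automatic labeling and generation through a chatbot.

(9) Grounded-SAM + OSX: text-guided 3D whole-body mesh recovery that detects any person and reconstructs their 3D human mesh.

(10) Interactive fashion-edit playground: click to segment and edit.

(11) Interactive face-edit playground: click to edit faces.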

The rest of this section uses the Gradio app as an example to walk through setting up a demo.

3.1 Running the Gradio app

(1) Install gradio

pip install gradio

(2) Upgrade transformers

pip install --upgrade transformers

(3) Install openai

pip install openai

(4) Download the models

Two large models are used here: the SAM segment-anything checkpoint sam_vit_h_4b8939.pth and the Grounding DINO detect-anything checkpoint groundingdino_swint_ogc.pth.

cd Grounded-Segment-Anything
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
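An interrupted download can leave a missing or empty checkpoint file, which tends to surface later as a confusing load error. The sketch below checks for both files, assuming the app looks for them in the repository root as the download step suggests; the helper `missing_checkpoints` is a hypothetical name.

```python
from pathlib import Path

# Checkpoint file names taken from the download commands above.
CHECKPOINTS = ("sam_vit_h_4b8939.pth", "groundingdino_swint_ogc.pth")


def missing_checkpoints(repo_dir=".", names=CHECKPOINTS):
    """Return checkpoint names that are absent or empty in repo_dir."""
    root = Path(repo_dir)
    missing = []
    for name in names:
        path = root / name
        # Treat a zero-byte file as missing: it usually means a failed download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_checkpoints(".")
    if missing:
        print("Missing or empty checkpoints:", ", ".join(missing))
    else:
        print("Both checkpoints are present.")
```

Run it from inside Grounded-Segment-Anything; it only inspects file presence and size, so it does not validate the checkpoint contents.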

(5) Run the Gradio app

python gradio_app.py

3.2 Using the Gradio app

Once started, the Gradio app runs as a web service that can be reached directly from a browser; the default port is 7589.
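If the page does not load, it is worth checking whether anything is listening on the port at all. The following is a minimal sketch using a plain TCP connection; the host `127.0.0.1` is an assumption (use the server's address when browsing from another machine), and `port_is_open` is a hypothetical helper name.

```python
import socket

DEFAULT_PORT = 7589  # default port stated in this guide


def port_is_open(host="127.0.0.1", port=DEFAULT_PORT, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if port_is_open():
        print(f"Service reachable at http://127.0.0.1:{DEFAULT_PORT}")
    else:
        print(f"Nothing is listening on port {DEFAULT_PORT} yet.")
```

A successful connection only shows that some process accepts connections on the port; it does not confirm that the process is the Gradio app.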

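The Gradio app provides six task_type modes:

(1) scribble: segmentation through Segment Anything with mouse interaction (click on the object with the mouse; no prompt is needed).

(2) automask: segmentation of the whole image in one pass through Segment Anything (no prompt is needed).

(3) det: detection through Grounding DINO with text interaction (a text prompt is required).

(4) seg: detection plus segmentation by combining Grounding DINO and Segment Anything with text interaction (a text prompt is required).

(5) inpainting: replacement of the target object by combining Grounding DINO, Segment Anything, and Stable Diffusion with text interaction (a text prompt and an inpaint prompt are required).

(6) automatic: non-interactive detection plus segmentation by combining BLIP, Grounding DINO, and Segment Anything (no prompt is needed).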
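The prompt requirements of the six modes can be summarized as a small lookup table. This is an illustrative sketch only, not code from gradio_app.py; the key names `text_prompt` and `inpaint_prompt` and the helper `missing_prompts` are hypothetical labels for the two prompts mentioned in the list above.

```python
# Prompts each task_type needs, as described in the list above.
REQUIRED_PROMPTS = {
    "scribble": (),      # mouse clicks only
    "automask": (),
    "det": ("text_prompt",),
    "seg": ("text_prompt",),
    "inpainting": ("text_prompt", "inpaint_prompt"),
    "automatic": (),
}


def missing_prompts(task_type, **prompts):
    """Return the required prompt names that are absent or blank."""
    if task_type not in REQUIRED_PROMPTS:
        raise ValueError(f"unknown task_type: {task_type!r}")
    return [
        name
        for name in REQUIRED_PROMPTS[task_type]
        if not (prompts.get(name) or "").strip()
    ]


if __name__ == "__main__":
    print(missing_prompts("inpainting", text_prompt="dog"))
    print(missing_prompts("automask"))
```

A table like this makes it easy to see that only det, seg, and inpainting need text input, and that inpainting is the only mode that needs two prompts.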
