Open-vocabulary object detection with OWLv2 via transformers

Contents

  • Installation
  • Demo

Installation

pip install transformers torch pillow numpy
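
The demo below imports PIL, numpy, and torch in addition to transformers, so it may help to confirm that all four are importable before running it. The following is a minimal sketch; the helper name `missing_packages` is hypothetical and not part of any library.

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names used by the demo (note: the pillow package is imported as PIL).
    missing = missing_packages(["PIL", "numpy", "torch", "transformers"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All demo dependencies are importable.")
```

An empty result only means the modules can be located; it does not check versions or whether a CUDA build of torch is installed.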

Demo

from PIL import Image, ImageDraw, ImageFont
import numpy as np
import torch
from transformers import AutoProcessor, Owlv2ForObjectDetection
from transformers.utils.constants import OPENAI_CLIP_MEAN, OPENAI_CLIP_STD

# Load the processor and model from a local checkpoint directory
processor = AutoProcessor.from_pretrained("/home/share3/mayunchuan/google/owlv2-large-patch14-ensemble")
model = Owlv2ForObjectDetection.from_pretrained("/home/share3/mayunchuan/google/owlv2-large-patch14-ensemble").cuda()

image = Image.open('/home/mayunchuan/lavad/dataset/Thumos14_25fps/frames/video_test_0000293/004902.jpg')
# image = Image.open('/home/mayunchuan/lavad/dataset/Thumos14_25fps/frames/video_validation_0000990/001388.jpg')
# texts = [["a photo of a volleyball", "a photo of a man"]]
texts = [[" javelin"]]
inputs = processor(text=texts, images=image, return_tensors="pt")

# Move the model inputs to the GPU
inputs['input_ids'] = inputs['input_ids'].cuda()
inputs['attention_mask'] = inputs['attention_mask'].cuda()
inputs['pixel_values'] = inputs['pixel_values'].cuda()

# forward pass
with torch.no_grad():
    outputs = model(**inputs)

# Note: boxes need to be visualized on the padded, unnormalized image
# hence we'll set the target image sizes (height, width) based on that
def get_preprocessed_image(pixel_values):
    pixel_values = pixel_values.squeeze().cpu().numpy()
    # Undo the CLIP mean/std normalization, channel by channel
    unnormalized_image = (pixel_values * np.array(OPENAI_CLIP_STD)[:, None, None]) + np.array(OPENAI_CLIP_MEAN)[:, None, None]
    unnormalized_image = (unnormalized_image * 255).astype(np.uint8)
    # (channels, height, width) -> (height, width, channels)
    unnormalized_image = np.moveaxis(unnormalized_image, 0, -1)
    unnormalized_image = Image.fromarray(unnormalized_image)
    return unnormalized_image

unnormalized_image = get_preprocessed_image(inputs.pixel_values)

# PIL reports size as (width, height); the post-processor expects (height, width)
target_sizes = torch.Tensor([unnormalized_image.size[::-1]])
# Convert outputs (bounding boxes and class logits) to final bounding boxes and scores
results = processor.post_process_object_detection(
    outputs=outputs, threshold=0.2, target_sizes=target_sizes
)

i = 0  # Retrieve predictions for the first image for the corresponding text queries
text = texts[i]
boxes, scores, labels = results[i]["boxes"], results[i]["scores"], results[i]["labels"]

for box, score, label in zip(boxes, scores, labels):
    box = [round(coord, 2) for coord in box.tolist()]
    print(f"Detected {text[label.item()]} with confidence {round(score.item(), 3)} at location {box}")

# Draw the bounding boxes
draw = ImageDraw.Draw(unnormalized_image)
for score, label, box in zip(scores, labels, boxes):
    box = [round(coord, 2) for coord in box.tolist()]
    x, y, x2, y2 = tuple(box)
    draw.rectangle((x, y, x2, y2), outline="red", width=1)
    # The font_size argument requires Pillow 10.1 or newer
    draw.text((x, y), text[label.item()], font_size=20, fill="black")

# Save the annotated image
unnormalized_image.save("marked_image.jpg")
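
The demo draws its boxes on the padded, preprocessed image rather than on the original frame. If the boxes are needed in original-image pixels instead, a rough sketch of the conversion follows. It assumes the processor pads the image to a square along the bottom and right edges before resizing, so a single scale factor applies to both axes. That is an assumption about the preprocessing and should be checked against the image processor in use. The function name `boxes_to_original_frame` and its parameters are hypothetical.

```python
def boxes_to_original_frame(boxes, padded_size, orig_width, orig_height):
    """Map (x1, y1, x2, y2) boxes from the padded square image to original pixels.

    Assumes bottom/right padding to a square, so the original image occupies the
    top-left corner and one scale factor covers both axes.
    """
    scale = max(orig_width, orig_height) / padded_size
    mapped = []
    for x1, y1, x2, y2 in boxes:
        mapped.append((
            # Clip to the original image, since a box may extend into the padding
            min(max(x1 * scale, 0.0), orig_width),
            min(max(y1 * scale, 0.0), orig_height),
            min(max(x2 * scale, 0.0), orig_width),
            min(max(y2 * scale, 0.0), orig_height),
        ))
    return mapped


if __name__ == "__main__":
    # Hypothetical values: a 960x960 padded input and a 1920x1080 original frame
    print(boxes_to_original_frame([(480, 240, 960, 960)], 960, 1920, 1080))
```

In the demo, `padded_size` would correspond to the side length of `unnormalized_image`, and `orig_width` and `orig_height` to `image.size`.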