
Milvus vector database: connection test, and searches returning nothing when the collection dimension does not match

news 2025/7/18 10:44:56

1. Connection test

import random
import time
from pymilvus import (
    connections,
    utility,
    FieldSchema,
    CollectionSchema,
    DataType,
    Collection,
)

# Test collection name and parameters
COLLECTION_NAME = "test_collection"
DIMENSION = 128  # vector dimension
INDEX_FILE_SIZE = 32  # index file size (not used below)
METRIC_TYPE = "L2"  # distance metric: Euclidean distance
INDEX_TYPE = "IVF_FLAT"  # index type
NLIST = 1024  # number of clusters for the IVF index
NPROBE = 16  # number of clusters probed per search
TOP_K = 5  # number of nearest neighbors returned


def connect_to_milvus():
    """Connect to the Milvus server."""
    print("Connecting to the Milvus server...")
    try:
        connections.connect("default", host="localhost", port="19530")
        print("Connected!")
        return True
    except Exception as e:
        print(f"Connection failed: {e}")
        return False


def create_collection():
    """Create the collection and its fields."""
    if utility.has_collection(COLLECTION_NAME):
        utility.drop_collection(COLLECTION_NAME)
        print(f"Dropped existing collection: {COLLECTION_NAME}")

    # Define the collection fields
    fields = [
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=False),
        FieldSchema(name="random_value", dtype=DataType.DOUBLE),
        FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=DIMENSION),
    ]

    # Create the collection schema
    schema = CollectionSchema(fields, description="Test collection")

    # Create the collection
    collection = Collection(name=COLLECTION_NAME, schema=schema)
    print(f"Collection '{COLLECTION_NAME}' created")
    return collection


def insert_data(collection, num_entities=10):
    """Insert vector data into the collection."""
    # Generate some random data, one list per field (column order matches the schema)
    entities = [
        # id field
        [i for i in range(num_entities)],
        # random_value field
        [random.random() for _ in range(num_entities)],
        # embedding field (vectors)
        [[random.random() for _ in range(DIMENSION)] for _ in range(num_entities)],
    ]

    # Insert the data
    insert_result = collection.insert(entities)

    # Flush after inserting so the data becomes available for search
    collection.flush()

    print(f"Inserted {insert_result.insert_count} records")
    return insert_result


def create_index(collection):
    """Create an index on the collection."""
    index_params = {
        "metric_type": METRIC_TYPE,
        "index_type": INDEX_TYPE,
        "params": {"nlist": NLIST},
    }

    print(f"Creating {INDEX_TYPE} index on the 'embedding' field...")
    collection.create_index("embedding", index_params)
    print("Index created!")


def perform_search(collection, search_vectors):
    """Run a vector search."""
    # Load the collection into memory
    collection.load()

    # Search parameters
    search_params = {"metric_type": METRIC_TYPE, "params": {"nprobe": NPROBE}}

    # Run the search
    results = collection.search(
        data=search_vectors,  # vectors to search for
        anns_field="embedding",  # field to search on
        param=search_params,  # search parameters
        limit=TOP_K,  # number of nearest neighbors to return
        output_fields=["random_value"],  # extra fields to return
    )

    return results


def main():
    """Main test routine."""
    # Connect to the Milvus server
    if not connect_to_milvus():
        return

    # Create the test collection
    collection = create_collection()

    # Insert data
    insert_data(collection, num_entities=100)

    # Create the index
    create_index(collection)

    # Generate some query vectors
    vectors_to_search = [[random.random() for _ in range(DIMENSION)] for _ in range(2)]

    # Run the vector search
    results = perform_search(collection, vectors_to_search)

    # Print the search results
    for i, hits in enumerate(results):
        print(f"Results for query vector {i}:")
        for hit in hits:
            print(f"ID: {hit.id}, distance: {hit.distance}, random value: {hit.entity.get('random_value')}")

    # Clean up: drop the collection
    if utility.has_collection(COLLECTION_NAME):
        utility.drop_collection(COLLECTION_NAME)
        print(f"Test finished, collection '{COLLECTION_NAME}' dropped")


if __name__ == "__main__":
    main()

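2. Searches returned nothing because the code I started with created the collection with a vector dimension of 1024, which does not match my 768-dimensional vectors. Dropping the collection and recreating it with the right dimension fixed it.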
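
A mismatch like this can be caught before inserting or searching by comparing the dimension declared in the collection schema with the length of the vectors. The sketch below is only an illustration: the helper names `vector_dim` and `check_dim` are made up for this example, and it assumes the schema object exposes `fields` whose entries have a `name` and a `params` dict containing `"dim"`, which is how pymilvus `FieldSchema` objects are expected to look. A stand-in schema is used so the snippet runs without a Milvus server.

```python
from types import SimpleNamespace


def vector_dim(schema, field_name):
    """Return the dimension declared for a vector field in a collection schema."""
    for field in schema.fields:
        if field.name == field_name:
            # Assumption: vector fields carry their dimension in params["dim"]
            return int(field.params["dim"])
    raise KeyError(f"field {field_name!r} not found in schema")


def check_dim(schema, field_name, vectors):
    """Raise ValueError if any vector's length differs from the schema dimension."""
    expected = vector_dim(schema, field_name)
    for i, vec in enumerate(vectors):
        if len(vec) != expected:
            raise ValueError(
                f"vector {i} has dimension {len(vec)}, "
                f"but field {field_name!r} expects {expected}"
            )
    return expected


# Stand-in for collection.schema (hypothetical; mimics a 1024-dim collection)
fake_schema = SimpleNamespace(
    fields=[
        SimpleNamespace(name="id", params={}),
        SimpleNamespace(name="embedding", params={"dim": 1024}),
    ]
)

try:
    # A 768-dim vector against a 1024-dim collection, as in the case above
    check_dim(fake_schema, "embedding", [[0.0] * 768])
except ValueError as e:
    print(f"Dimension mismatch: {e}")
```

With a real collection, the same check would presumably be called as `check_dim(collection.schema, "embedding", vectors)` before `collection.insert` or `collection.search`.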

3. Module not found: add the directory to the module search path (sys.path)

Two levels up

import os
import sys

# Directory containing the current file
current_dir = os.path.dirname(os.path.abspath(__file__))
print(f"Current directory: {current_dir}")

# One level up (parent directory)
parent_dir = os.path.dirname(current_dir)
print(f"Parent directory: {parent_dir}")

# Two levels up (parent of the parent directory)
grandparent_dir = os.path.dirname(parent_dir)
print(f"Directory two levels up: {grandparent_dir}")

# Add the directory two levels up to the Python module search path
if grandparent_dir not in sys.path:
    print(f"Adding directory two levels up to sys.path: {grandparent_dir}")
    sys.path.append(grandparent_dir)

One level up

import os
import sys

# Add the project root directory to the module search path
current_dir = os.path.dirname(os.path.abspath(__file__))
parent_dir = os.path.dirname(current_dir)
sys.path.append(parent_dir)
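
Both variants can be folded into one small helper. This is only a sketch using `pathlib`; the function name `add_ancestor_to_sys_path` and its `levels` parameter are made up for this example. `levels=1` corresponds to the one-level-up case and `levels=2` to the two-levels-up case.

```python
import sys
from pathlib import Path


def add_ancestor_to_sys_path(file_path, levels=1):
    """Append the directory `levels` above the file's own directory to sys.path.

    parents[0] is the directory containing the file, so parents[levels]
    is `levels` directories above it.
    """
    target = str(Path(file_path).resolve().parents[levels])
    if target not in sys.path:  # avoid duplicate entries
        sys.path.append(target)
    return target


# Typical use at the top of a script, before importing project modules:
# add_ancestor_to_sys_path(__file__, levels=2)
```

The helper appends rather than inserts at the front, matching the snippets above, so modules already importable from earlier `sys.path` entries keep precedence.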

