
Spring AI Local RAG in Practice: Building an Offline Knowledge Q&A System with Redis and Chroma

news 2025/7/10 13:54:22

This article uses Ollama + Qwen-7B to build an offline knowledge Q&A system (with Redis/Chroma vector stores).

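Preface
Environment Setup
Project Structure Design
Maven Dependencies (pom.xml)
application.yml Configuration (Redis + Ollama)
Redis Vector Store in Practice
- OllamaConfig.java
- RagService.java
- RagController.java
- RagApplication.java
- Test Examples
RAG Enhancements
- Additional Maven Dependencies (Document Processing + Multipart)
- RagService Enhancement (addDocumentsFromFiles)
- New RagController Endpoint for Uploading Documents
- Testing with Postman / Swagger
Enhanced Q&A Results
Summary
References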

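Preface

With large-model applications still surging in popularity, RAG (Retrieval-Augmented Generation) has become a key architecture for building enterprise knowledge Q&A and intelligent retrieval over private documents. Many developers, however, run into the following problems right at the start:

API token limits (such as OpenAI quotas)
Network restrictions (such as intranet environments or sensitive data)
English-centric models that struggle to understand Chinese business data

As a result, deploying an LLM locally (for example Ollama + Qwen) has become a popular choice.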

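Spring AI, used here at version 1.0.0-M5, supports connecting to local LLMs through Ollama, so developers in the Java community can also:

Build a complete RAG system offline: private data storage → local vector retrieval → answers from a local model

This tutorial walks step by step through using:

Qwen-7B: the open-source Tongyi Qianwen large language model, with strong Chinese support
Spring AI: a unified framework for connecting to LLMs
Redis vector database + Chroma local database
to build a runnable, packageable offline Q&A system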

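Environment Setup

This article uses a local Ollama + Qwen-7B model. For how to set it up locally, see the relevant part of the earlier article: Spring AI Basic Components Explained: ChatClient, Prompt, Memory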

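Project Structure Design

[Figure: project structure diagram, image not available]

The code in this article uses the packages com.example.rag.config, com.example.rag.service, and com.example.rag.controller, with RagApplication in com.example.rag.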

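Maven Dependencies (pom.xml)

This example targets Spring Boot 2.5.3 + Spring AI 1.0.0-M5. Note that Spring AI requires Spring Boot 3.x and Java 17 or later, so the Spring Boot version below must be raised to a 3.x release before the project will build:

<project>
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.example</groupId>
    <artifactId>spring-ai-rag-local</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>Spring AI Local RAG</name>

    <properties>
        <java.version>17</java.version>
        <!-- NOTE: Spring AI requires Spring Boot 3.x; raise this value before building -->
        <spring.boot.version>2.5.3</spring.boot.version>
        <spring-ai.version>1.0.0-M5</spring-ai.version>
    </properties>

    <!-- Supplies versions for the Spring Boot starters declared below -->
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-dependencies</artifactId>
                <version>${spring.boot.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
        <!-- Spring Boot core dependency -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <!-- Spring AI core (Ollama starter) -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
            <version>${spring-ai.version}</version>
        </dependency>

        <!-- Embedding model and VectorStore (artifact IDs end in "-store") -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-redis-store</artifactId>
            <version>${spring-ai.version}</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-chroma-store</artifactId>
            <version>${spring-ai.version}</version>
        </dependency>

        <!-- Redis -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-redis</artifactId>
        </dependency>

        <!-- Logging and testing -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-logging</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <!-- Spring AI milestone builds are published to the Spring milestone repository -->
    <repositories>
        <repository>
            <id>spring-milestones</id>
            <name>Spring Milestones</name>
            <url>https://repo.spring.io/milestone</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
    </repositories>

    <build>
        <plugins>
            <!-- Spring Boot plugin -->
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>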

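application.yml Configuration (Redis + Ollama)

server:
  port: 8080

spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        options:
          # The Ollama starter reads the chat model name from chat.options.model
          model: qwen:7b
  data:
    redis:
      host: localhost
      port: 6379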

Ollama listens on port 11434 by default, and the Qwen model is loaded on demand.
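Before starting the application, it can help to confirm that Ollama is reachable and that the model has been pulled. The sketch below assumes Ollama's REST endpoint GET /api/tags (which lists locally available models) on the default port. The class name OllamaCheck and the plain substring match on the JSON response are illustrative simplifications, not part of Spring AI.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper (not part of Spring AI): checks that a local Ollama
// server responds and that a given model name appears in its model list.
final class OllamaCheck {

    // Builds the URI of Ollama's "list local models" endpoint.
    static URI tagsUri(String baseUrl) {
        String trimmed = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(trimmed + "/api/tags");
    }

    // Naive check: looks for the model name as a quoted string in the JSON body.
    static boolean mentionsModel(String responseBody, String model) {
        return responseBody.contains("\"" + model + "\"");
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(3))
                .build();
        HttpRequest request = HttpRequest.newBuilder(tagsUri("http://localhost:11434"))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("HTTP status: " + response.statusCode());
        System.out.println("qwen:7b available: " + mentionsModel(response.body(), "qwen:7b"));
    }
}
```

If the model is reported as unavailable, pulling it with the Ollama CLI before starting the Spring Boot application avoids a slow or failing first request.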

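Redis Vector Store in Practice

We will implement the following modules:

Configuration class: initializes the Ollama client and the VectorStore (Redis)
Service class: implements document ingestion and the retrieval + Q&A logic
Controller: exposes the REST endpoints
Sample document: used to test the RAG Q&A results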

OllamaConfig.java

package com.example.rag.config;

import org.springframework.ai.embedding.EmbeddingClient;
import org.springframework.ai.ollama.OllamaChatClient;
import org.springframework.ai.ollama.OllamaEmbeddingClient;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.ai.vectorstore.redis.RedisVectorStore;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.core.StringRedisTemplate;

// NOTE: the *Client model types used here were renamed to *Model in Spring AI
// 1.0.0-M1 (for example OllamaChatModel, EmbeddingModel). Check class names and
// constructor signatures against the Spring AI version on your classpath.
@Configuration
public class OllamaConfig {

    // Chat client backed by the local Ollama server, using the qwen:7b model
    @Bean
    public OllamaChatClient chatClient() {
        return new OllamaChatClient("http://localhost:11434", "qwen:7b");
    }

    // Embedding client that calls Ollama to turn text into vectors
    @Bean
    public EmbeddingClient embeddingClient() {
        return new OllamaEmbeddingClient("http://localhost:11434");
    }

    // Redis-backed vector store that embeds documents with the client above
    @Bean
    public VectorStore redisVectorStore(EmbeddingClient embeddingClient,
                                        StringRedisTemplate redisTemplate) {
        return new RedisVectorStore(embeddingClient, redisTemplate);
    }
}

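💡 If OllamaEmbeddingClient cannot be resolved, the Spring AI version on the classpath does not match this code: from 1.0.0-M1 onward the *Client model types are named *Model (for example EmbeddingModel). As a stopgap, OpenAiEmbeddingClient can be used as a placeholder first.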

RagService.java

package com.example.rag.service;

import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

import java.util.List;
import java.util.stream.Collectors;

// NOTE: org.springframework.ai.chat.ChatClient is the pre-1.0.0-M1 interface;
// later versions expose the same call(Prompt) method on ChatModel.
@Service
public class RagService {

    private final VectorStore vectorStore;
    private final ChatClient chatClient;

    public RagService(VectorStore vectorStore, ChatClient chatClient) {
        this.vectorStore = vectorStore;
        this.chatClient = chatClient;
    }

    // Add a document; the vector store embeds it on insert
    public void addDocument(String text) {
        Document doc = new Document(text);
        vectorStore.add(List.of(doc));
    }

    // Retrieve the 3 most similar documents and ask the model to answer from them
    public String ask(String question) {
        List<Document> docs = vectorStore.similaritySearch(
                SearchRequest.query(question).withTopK(3));

        // Join the retrieved passages into one context block
        String context = docs.stream()
                .map(Document::getContent)
                .collect(Collectors.joining("\n"));

        Prompt prompt = new Prompt(List.of(new UserMessage(
                "Answer the question based on the following knowledge:\n" + context
                        + "\nThe question is: " + question)));

        return chatClient.call(prompt).getResult().getOutput().getContent();
    }
}
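To make the retrieval step in ask concrete, the following sketch shows what a top-K similarity search does conceptually: score every stored vector against the query vector with cosine similarity and keep the K best. It is an in-memory illustration under simplifying assumptions (hand-written two-dimensional vectors, a full scan), not how RedisVectorStore is implemented; the names TopKSearchSketch and Entry are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: an in-memory stand-in for what a vector store does
// conceptually when asked for the top-K most similar documents.
final class TopKSearchSketch {

    // A stored text together with its embedding vector.
    record Entry(String text, double[] vector) {}

    // Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for zero vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the texts of the k entries most similar to the query vector,
    // best match first.
    static List<String> topK(List<Entry> entries, double[] query, int k) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator
                .comparingDouble((Entry e) -> cosine(e.vector(), query))
                .reversed());
        List<String> result = new ArrayList<>();
        for (Entry e : sorted.subList(0, Math.min(k, sorted.size()))) {
            result.add(e.text());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Entry> store = List.of(
                new Entry("doc about Spring AI", new double[] {1.0, 0.0}),
                new Entry("doc about Redis", new double[] {0.0, 1.0}),
                new Entry("doc about both", new double[] {1.0, 1.0}));
        System.out.println(topK(store, new double[] {1.0, 0.1}, 2));
    }
}
```

In the real service the vectors come from the embedding model, and the vector store performs this ranking on the database side instead of scanning a Java list.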

RagController.java

package com.example.rag.controller;

import com.example.rag.service.RagService;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api/rag")
public class RagController {

    private final RagService ragService;

    public RagController(RagService ragService) {
        this.ragService = ragService;
    }

    // Accepts the raw request body as the document text
    @PostMapping("/add")
    public String addDocument(@RequestBody String content) {
        ragService.addDocument(content);
        return "Document added";
    }

    // Answers a question using the stored documents as context
    @GetMapping("/ask")
    public String ask(@RequestParam String question) {
        return ragService.ask(question);
    }
}

RagApplication.java

package com.example.rag;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class RagApplication {

    public static void main(String[] args) {
        SpringApplication.run(RagApplication.class, args);
    }
}

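Test Examples

# Add a document
curl -X POST http://localhost:8080/api/rag/add \
  -H "Content-Type: text/plain" \
  -d "Spring AI is a Java framework that provides a unified wrapper for calling large language models, with support for OpenAI, Ollama, HuggingFace, and others."

# Ask a question (-G with --data-urlencode encodes the spaces in the query string)
curl -G http://localhost:8080/api/rag/ask \
  --data-urlencode "question=Which models does Spring AI support?"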
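The same two calls can be made from Java using only the JDK's HttpClient. This is a minimal sketch that assumes the application is running on localhost:8080 with the /api/rag/add and /api/rag/ask endpoints shown earlier; the class name RagClientSketch and the helper askUri are hypothetical. The point it illustrates is that the question must be URL-encoded before it goes into the query string.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client for the /add and /ask endpoints, using only the JDK.
final class RagClientSketch {

    static final String BASE = "http://localhost:8080/api/rag";

    // Builds the /ask URI with the question encoded as a query parameter.
    static URI askUri(String question) {
        String encoded = URLEncoder.encode(question, StandardCharsets.UTF_8);
        return URI.create(BASE + "/ask?question=" + encoded);
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // POST the document as plain text, matching the /add endpoint.
        HttpRequest add = HttpRequest.newBuilder(URI.create(BASE + "/add"))
                .header("Content-Type", "text/plain; charset=UTF-8")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "Spring AI is a Java framework that wraps calls to large language models."))
                .build();
        System.out.println(client.send(add, HttpResponse.BodyHandlers.ofString()).body());

        // GET the answer; askUri takes care of encoding spaces and punctuation.
        HttpRequest ask = HttpRequest.newBuilder(askUri("Which models does Spring AI support?"))
                .GET()
                .build();
        System.out.println(client.send(ask, HttpResponse.BodyHandlers.ofString()).body());
    }
}
```

URLEncoder produces form-style encoding (a space becomes +), which servlet containers decode back to a space for query parameters.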

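RAG Enhancements

Requirements:

Upload .txt / .md files, one or several at a time
Read the file contents automatically
Split the text into chunks
Vectorize in batches and write to the vector database (Redis/Chroma)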

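Additional Maven Dependencies (Document Processing + Multipart)

Add the following to pom.xml. Multipart upload support itself already comes with spring-boot-starter-web:

<!-- File upload & text processing -->
<dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.11.0</version>
</dependency>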

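RagService Enhancement (addDocumentsFromFiles)

// Additional imports needed in RagService:
// import org.springframework.web.multipart.MultipartFile;
// import java.io.IOException;
// import java.nio.charset.StandardCharsets;
// import java.util.ArrayList;

// Read each uploaded file as UTF-8 text, split it into chunks,
// and write all chunks to the vector store in one batch
public void addDocumentsFromFiles(List<MultipartFile> files) {
    List<Document> docs = new ArrayList<>();
    for (MultipartFile file : files) {
        try {
            String content = new String(file.getBytes(), StandardCharsets.UTF_8);
            List<String> chunks = splitIntoChunks(content, 300); // 300 characters per chunk
            for (String chunk : chunks) {
                docs.add(new Document(chunk));
            }
        } catch (IOException e) {
            throw new RuntimeException("Unable to read file: " + file.getOriginalFilename(), e);
        }
    }
    vectorStore.add(docs);
}

// Simple text chunking: consecutive fixed-size windows with no overlap
private List<String> splitIntoChunks(String text, int size) {
    List<String> chunks = new ArrayList<>();
    for (int start = 0; start < text.length(); start += size) {
        int end = Math.min(start + size, text.length());
        chunks.add(text.substring(start, end));
    }
    return chunks;
}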
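Fixed-size chunking can cut a sentence in half exactly at a chunk boundary, so neither chunk contains it in full. A common variation, shown here as a sketch and not as part of the original service, is to let consecutive chunks overlap by a few characters. The class name OverlapChunkerSketch and the overlap parameter are assumptions introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative variant of splitIntoChunks: fixed-size windows that overlap,
// so text cut at a boundary also appears at the start of the next chunk.
final class OverlapChunkerSketch {

    static List<String> split(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window already reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(split("abcdefghij", 4, 1));
    }
}
```

With an overlap of 0 this behaves like the splitIntoChunks method in the service. A larger overlap stores more redundant text but makes it less likely that a relevant sentence is split across two chunks.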

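New RagController Endpoint for Uploading Documents

// Additional imports needed in RagController:
// import org.springframework.web.multipart.MultipartFile;
// import java.util.List;

// Accepts one or more files under the form field "files"
@PostMapping("/upload")
public String uploadFiles(@RequestParam("files") List<MultipartFile> files) {
    ragService.addDocumentsFromFiles(files);
    return "Files processed and stored in the vector store";
}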

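Testing with Postman / Swagger

Request:

POST /api/rag/upload
Request body type: form-data
- Parameter name: files
- Type: file
- Multiple text files (.txt / .md / .log and similar) can be uploaded at once

Sample response:

Files processed and stored in the vector store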

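Enhanced Q&A Results

To make the response more transparent, we adjust the structure returned by RagService so that it includes the retrieved passages alongside the answer:

// Additional imports needed in RagService:
// import java.util.LinkedHashMap;
// import java.util.Map;
// import java.util.stream.Collectors;

// Returns the answer together with the passages that were used as context
public Map<String, Object> askWithSources(String question) {
    List<Document> docs = vectorStore.similaritySearch(
            SearchRequest.query(question).withTopK(3));

    String context = docs.stream()
            .map(Document::getContent)
            .collect(Collectors.joining("\n"));

    Prompt prompt = new Prompt(List.of(new UserMessage(
            "Answer the question based on the following knowledge:\n" + context
                    + "\nUser question: " + question)));

    String answer = chatClient.call(prompt).getResult().getOutput().getContent();

    // LinkedHashMap keeps "answer" before "sources" in the JSON response
    Map<String, Object> result = new LinkedHashMap<>();
    result.put("answer", answer);
    result.put("sources", docs.stream()
            .map(Document::getContent)
            .collect(Collectors.toList()));
    return result;
}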

The corresponding endpoint:

// Additional import needed in RagController:
// import java.util.Map;

@GetMapping("/ask-v2")
public Map<String, Object> askWithSources(@RequestParam String question) {
    return ragService.askWithSources(question);
}
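Once the sources are returned to the caller, it can also be useful for the answer itself to point at them. One possible approach, sketched here as an assumption and not taken from the original service, is to number the retrieved passages in the prompt and ask the model to cite those numbers. The class name SourcePromptSketch and the prompt wording are illustrative.

```java
import java.util.List;

// Hypothetical prompt builder: numbers each retrieved passage so the model
// can be asked to cite the passages it relied on.
final class SourcePromptSketch {

    static String build(List<String> passages, String question) {
        StringBuilder sb = new StringBuilder(
                "Answer the question using only the numbered passages below. ")
                .append("Cite passage numbers like [1]. ")
                .append("If the passages do not contain the answer, say so.\n\n");
        for (int i = 0; i < passages.size(); i++) {
            // Passage numbers start at 1 and match the order of the sources list
            sb.append('[').append(i + 1).append("] ").append(passages.get(i)).append('\n');
        }
        sb.append("\nQuestion: ").append(question);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build(
                List.of("Spring AI supports Ollama.", "Redis can store vectors."),
                "Which model runtimes does Spring AI support?"));
    }
}
```

Because the numbering follows the order of the retrieved documents, a client can map a citation such as [2] back to the second entry of the sources list in the response. Whether the model follows the citation instruction reliably depends on the model and is not guaranteed.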

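Summary

Feature	Description
Local model	Uses Ollama + Qwen-7B, suited to Chinese text
Multiple vector databases	Redis and Chroma dependencies are included; the code in this article wires up Redis
Text input endpoint	Accepts plain-text input; uploaded text is split into chunks in batch
File upload endpoint	Supports uploading multiple files (txt/md/log)
Q&A endpoint	Supports user questions answered by the LLM
Source content in responses	Can return the content of the matched documents
RESTful style	Standard endpoints that are easy to integrate with a front end
Offline deployment	No dependency on external networks, suited to intranet environments
Extensible structure	Leaves room to add model switching, an API gateway, or a message queue later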