
Docker Model Runner Chat

news 2025/7/7 15:58:32

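Docker Model Runner is an AI inference engine that offers a wide range of models from various providers.

Spring AI integrates with Docker Model Runner by reusing the existing OpenAI-backed ChatClient. To do this, set the base URL to localhost:12434/engines and select one of the provided LLM models.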
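To make the "OpenAI-compatible" part concrete, here is a minimal sketch that builds (but does not send) the kind of HTTP request an OpenAI-style client would issue against Docker Model Runner. The class name and the /v1/chat/completions path are assumptions based on the usual OpenAI convention, not something this page states; the base URL and model name are the ones used elsewhere in this document.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds an OpenAI-style chat request without sending it.
final class ModelRunnerRequestSketch {

    // Base URL from the text above, with the http:// scheme added.
    static final String BASE_URL = "http://localhost:12434/engines";

    // Assumption: the OpenAI-compatible chat endpoint sits under this path.
    static final String CHAT_PATH = "/v1/chat/completions";

    static HttpRequest chatRequest(String baseUrl, String model, String userMessage) {
        // Naive JSON templating: inputs must not contain quotes or backslashes.
        String body = "{\"model\": \"%s\", \"messages\": [{\"role\": \"user\", \"content\": \"%s\"}]}"
                .formatted(model, userMessage);
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + CHAT_PATH))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = chatRequest(BASE_URL, "ai/gemma3:4B-F16", "Tell me a joke");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

In a real application Spring AI builds and sends this request for you; the sketch only illustrates why pointing the OpenAI client at a different base URL is enough.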

See the DockerModelRunnerWithOpenAiChatModelIT.java test for an example of using Docker Model Runner with Spring AI.

Prerequisite

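Download Docker Desktop for Mac 4.40.0.

Choose one of the following options to enable Model Runner:

Option 1:

Enable Model Runner with docker desktop enable model-runner --tcp 12434.

Set the base URL to localhost:12434/engines

Option 2:

Enable Model Runner with docker desktop enable model-runner.

Use Testcontainers and set the base URL as follows: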

@Container
private static final SocatContainer socat = new SocatContainer().withTarget(80, "model-runner.docker.internal");

@Bean
public OpenAiApi chatCompletionApi() {
    // Point the OpenAI client at the port that socat maps to the Model Runner.
    var baseUrl = "http://%s:%d/engines".formatted(socat.getHost(), socat.getMappedPort(80));
    return OpenAiApi.builder().baseUrl(baseUrl).apiKey("test").build();
}

You can learn more about Docker Model Runner by reading the "Run LLMs Locally with Docker" blog post.

Auto-configuration

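The artifact IDs of the Spring AI starter modules have been renamed since version 1.0.0.M7. Dependency names should now follow the updated naming patterns for models, vector stores, and MCP starters. See the upgrade notes for more information.

Spring AI provides Spring Boot auto-configuration for the OpenAI chat client. To enable it, add the following dependency to your project's Maven pom.xml file: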

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-openai</artifactId>
</dependency>

Or add the following to your Gradle build.gradle build file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-starter-model-openai'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Chat Properties

Retry Properties

The prefix spring.ai.retry is the property prefix that lets you configure the retry mechanism for the OpenAI chat model.

Property	Description	Default
spring.ai.retry.max-attempts	Maximum number of retry attempts.	10
spring.ai.retry.backoff.initial-interval	Initial sleep duration for the exponential backoff policy.	2 sec.
spring.ai.retry.backoff.multiplier	Backoff interval multiplier.	5
spring.ai.retry.backoff.max-interval	Maximum backoff duration.	3 min.
spring.ai.retry.on-client-errors	If false, throw a NonTransientAiException and do not attempt a retry for `4xx` client error codes.	false
spring.ai.retry.exclude-on-http-codes	List of HTTP status codes that should not trigger a retry (e.g. to throw NonTransientAiException).	empty
spring.ai.retry.on-http-codes	List of HTTP status codes that should trigger a retry (e.g. to throw TransientAiException).	empty
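To see how the backoff properties interact, the following sketch computes the wait before each retry for a capped exponential backoff. It illustrates the arithmetic only, under the assumption that the delay is multiplied after each failed attempt and clamped to the maximum; it is not the retry implementation Spring AI uses, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Sketch only: capped exponential backoff arithmetic, not Spring's retry code.
final class BackoffScheduleSketch {

    // Returns the wait before each retry; maxAttempts includes the first call,
    // so there are at most maxAttempts - 1 waits.
    static List<Duration> delays(int maxAttempts, Duration initial, double multiplier, Duration max) {
        List<Duration> result = new ArrayList<>();
        double currentMillis = initial.toMillis();
        for (int retry = 1; retry < maxAttempts; retry++) {
            long capped = (long) Math.min(currentMillis, max.toMillis());
            result.add(Duration.ofMillis(capped));
            // Grow the interval, never beyond the configured maximum.
            currentMillis = Math.min(currentMillis * multiplier, max.toMillis());
        }
        return result;
    }

    public static void main(String[] args) {
        // Defaults from the table above: 10 attempts, 2 s initial, multiplier 5, 3 min cap.
        delays(10, Duration.ofSeconds(2), 5.0, Duration.ofMinutes(3))
                .forEach(delay -> System.out.println(delay.toSeconds() + " s"));
    }
}
```

With the default values the interval reaches the 3 minute cap by the fourth retry, so lowering max-attempts is usually the quickest way to shorten the worst-case wait.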

Connection Properties

The prefix spring.ai.openai is the property prefix that lets you configure the connection to the OpenAI-compatible endpoint.

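Property	Description	Default
spring.ai.openai.base-url	The URL to connect to. For Docker Model Runner, set it to `http://localhost:12434/engines` (Option 1) or to the Testcontainers address from Option 2. Available models are listed at `hub.docker.com/u/ai`.	-
spring.ai.openai.api-key	Any string	-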

Configuration Properties

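Enabling and disabling the chat auto-configuration is now done through the top-level property spring.ai.model.chat.

To enable, set spring.ai.model.chat=openai (enabled by default).

To disable, set spring.ai.model.chat=none (or any value that does not match openai).

This change allows multiple models to be configured in one application.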
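The enable/disable rule above can be summarized as a small predicate. This is a simplified model of the described behavior, assuming an unset property counts as enabled; the class and method names are hypothetical, and details such as case handling are not modeled.

```java
// Sketch only: models the rule "openai or unset enables, anything else disables".
final class ChatModelToggleSketch {

    // propertyValue is the value of spring.ai.model.chat, or null when it is not set.
    static boolean openAiChatEnabled(String propertyValue) {
        return propertyValue == null || propertyValue.equals("openai");
    }

    public static void main(String[] args) {
        System.out.println("unset  -> " + openAiChatEnabled(null));
        System.out.println("openai -> " + openAiChatEnabled("openai"));
        System.out.println("none   -> " + openAiChatEnabled("none"));
    }
}
```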

The prefix spring.ai.openai.chat is the property prefix that lets you configure the chat model implementation for OpenAI.

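Property	Description	Default
spring.ai.openai.base-url	The URL to connect to. For Docker Model Runner, set it to `http://localhost:12434/engines` (Option 1) or to the Testcontainers address from Option 2.	-
spring.ai.openai.api-key	Any string	-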

All properties prefixed with spring.ai.openai.chat.options can be overridden at runtime by adding request-specific runtime options to the Prompt call.

Runtime Options

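OpenAiChatOptions.java provides the model configuration, such as the model to use, the temperature, the frequency penalty, and so on.

On start-up, the default options can be configured with the OpenAiChatModel(api, options) constructor or with the spring.ai.openai.chat.options.* properties.

At runtime, you can override the default options by adding new, request-specific options to the Prompt call. For example, to override the default model for a specific request: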

ChatResponse response = chatModel.call(
    new Prompt(
        "Generate the names of 5 famous pirates.",
        OpenAiChatOptions.builder()
            .model("ai/gemma3:4B-F16")
            .build()
    ));

In addition to the model-specific OpenAiChatOptions, you can use a portable ChatOptions instance, created with ChatOptions#builder().
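The override behavior can be pictured as a field-by-field merge in which a request-level value wins whenever it is set. The sketch below uses a hypothetical Options record and a hypothetical default model name rather than Spring AI's own types, so treat it as an illustration of the idea and not as the framework's merge logic.

```java
// Sketch only: request-level options override defaults field by field.
final class OptionsMergeSketch {

    // Hypothetical stand-in for a chat options object; null means "not set".
    record Options(String model, Double temperature) {}

    static Options merge(Options defaults, Options request) {
        if (request == null) {
            return defaults;
        }
        return new Options(
                request.model() != null ? request.model() : defaults.model(),
                request.temperature() != null ? request.temperature() : defaults.temperature());
    }

    public static void main(String[] args) {
        Options defaults = new Options("default-model", 0.7);         // hypothetical defaults
        Options perRequest = new Options("ai/gemma3:4B-F16", null);   // only the model is overridden
        System.out.println(merge(defaults, perRequest));
    }
}
```

Here the request changes only the model, so the default temperature is carried over unchanged.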

Function Calling

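Docker Model Runner supports tool/function calling when you select a model that supports it.

You can register custom Java functions with your ChatModel and have the provided model intelligently choose to output a JSON object containing the arguments for calling one or more of the registered functions. This is a powerful technique for connecting LLM capabilities with external tools and APIs.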

Tool Example

Here is a simple example of how to use Docker Model Runner function calling with Spring AI:

spring.ai.openai.api-key=test
spring.ai.openai.base-url=http://localhost:12434/engines
spring.ai.openai.chat.options.model=ai/gemma3:4B-F16

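import java.util.function.Function;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Description;

@SpringBootApplication
public class DockerModelRunnerLlmApplication {

    public static void main(String[] args) {
        SpringApplication.run(DockerModelRunnerLlmApplication.class, args);
    }

    @Bean
    CommandLineRunner runner(ChatClient.Builder chatClientBuilder) {
        return args -> {
            var chatClient = chatClientBuilder.build();

            var response = chatClient.prompt()
                .user("What is the weather in Amsterdam and Paris?")
                .functions("weatherFunction") // reference by bean name.
                .call()
                .content();

            System.out.println(response);
        };
    }

    @Bean
    @Description("Get the weather in location")
    public Function<WeatherRequest, WeatherResponse> weatherFunction() {
        return new MockWeatherService();
    }

    // Declared at this level so both the bean method and the service can refer to them.
    public record WeatherRequest(String location, String unit) {}

    public record WeatherResponse(double temp, String unit) {}

    public static class MockWeatherService implements Function<WeatherRequest, WeatherResponse> {

        @Override
        public WeatherResponse apply(WeatherRequest request) {
            // Mock data: 20 degrees for Amsterdam, 25 for anywhere else.
            double temperature = request.location().contains("Amsterdam") ? 20 : 25;
            return new WeatherResponse(temperature, request.unit());
        }
    }
}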

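In this example, when the model needs weather information, it automatically calls the weatherFunction bean, which can then fetch real-time weather data. The expected response is: "The weather in Amsterdam is currently 20 degrees Celsius, and the weather in Paris is currently 25 degrees Celsius."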
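Conceptually, the framework resolves the tool name the model asks for, deserializes the arguments, and applies the matching function. The following sketch shows only that lookup-and-apply step with a plain map; the class name, registry, and dispatch method are hypothetical and stand in for what Spring AI does with beans and JSON.

```java
import java.util.Map;
import java.util.function.Function;

// Sketch only: name-based tool dispatch, without JSON parsing or Spring beans.
final class ToolDispatchSketch {

    record WeatherRequest(String location, String unit) {}

    record WeatherResponse(double temp, String unit) {}

    // Registry keyed by tool name, similar in spirit to looking up a bean by name.
    static final Map<String, Function<WeatherRequest, WeatherResponse>> TOOLS = Map.of(
            "weatherFunction",
            request -> new WeatherResponse(
                    request.location().contains("Amsterdam") ? 20 : 25, request.unit()));

    // Applies the tool the model asked for, or fails if it was never registered.
    static WeatherResponse dispatch(String toolName, WeatherRequest arguments) {
        Function<WeatherRequest, WeatherResponse> tool = TOOLS.get(toolName);
        if (tool == null) {
            throw new IllegalArgumentException("Unknown tool: " + toolName);
        }
        return tool.apply(arguments);
    }

    public static void main(String[] args) {
        System.out.println(dispatch("weatherFunction", new WeatherRequest("Amsterdam", "C")));
        System.out.println(dispatch("weatherFunction", new WeatherRequest("Paris", "C")));
    }
}
```

For a question about two cities, the model would typically request the tool once per city and then compose the final answer from both results.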

Read more about OpenAI function calling.

Sample Controller

Create a new Spring Boot project and add spring-ai-starter-model-openai to your pom (or gradle) dependencies.

Add an application.properties file under the src/main/resources directory to enable and configure the OpenAI chat model:

spring.ai.openai.api-key=test
spring.ai.openai.base-url=http://localhost:12434/engines
spring.ai.openai.chat.options.model=ai/gemma3:4B-F16

# Docker Model Runner doesn't support embeddings, so we need to disable them.
spring.ai.openai.embedding.enabled=false

Here is an example of a simple @RestController class that uses the chat model for text generation.

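import java.util.Map;

import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import reactor.core.publisher.Flux;

@RestController
public class ChatController {

    private final OpenAiChatModel chatModel;

    @Autowired
    public ChatController(OpenAiChatModel chatModel) {
        this.chatModel = chatModel;
    }

    // Blocking call: returns the full generation in one response.
    @GetMapping("/ai/generate")
    public Map<String, String> generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return Map.of("generation", this.chatModel.call(message));
    }

    // Streaming call: emits ChatResponse chunks as the model produces them.
    @GetMapping("/ai/generateStream")
    public Flux<ChatResponse> generateStream(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        Prompt prompt = new Prompt(new UserMessage(message));
        return this.chatModel.stream(prompt);
    }
}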
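To try the controller from code, a client needs to URL-encode the message query parameter. The sketch below builds such a URI with the JDK only; the class name is hypothetical, and http://localhost:8080 is an assumption (the usual Spring Boot default), so adjust it to wherever the application actually runs.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch only: builds the URI for the /ai/generate endpoint shown above.
final class GenerateEndpointUriSketch {

    static URI generateUri(String appBaseUrl, String message) {
        // Encode the message so spaces and special characters survive the query string.
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8);
        return URI.create(appBaseUrl + "/ai/generate?message=" + encoded);
    }

    public static void main(String[] args) {
        // Assumption: the application listens on localhost:8080.
        System.out.println(generateUri("http://localhost:8080", "Tell me a joke"));
    }
}
```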
