
Problem solved: Exceeding the maximum tokens in Azure OpenAI (with Java)

Problem background:

I'm building a chat that returns queries based on the question you ask it, in reference to a specific database. For this I use Azure OpenAI and Java with Spring Boot.


My problem comes here:

How can I make the AI remember the previous questions without passing the context back to it? What I want to do is greatly reduce token consumption: depending on what the user asks, if the question contains a keyword, for example 'users', I pass into the context the information for that table, which is huge (field names, data types and descriptions), so after several questions the token usage rises to more than 10,000.


I can't show all the code since it's a project for my company.

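To give an idea, though, a stripped-down, hypothetical version of that keyword-based context building looks roughly like this (the class name, the table map and its contents are made up for illustration; the real schema text is much larger):

import java.util.Map;

public class ContextBuilder {

    // Hypothetical keyword-to-schema map; in the real project the schema text
    // (field names, data types, descriptions) is much larger.
    private static final Map<String, String> TABLE_SCHEMAS = Map.of(
            "users", "Table USERS: id (bigint, primary key), name (varchar), created_at (timestamp), ...",
            "orders", "Table ORDERS: id (bigint, primary key), user_id (bigint, FK to USERS), total (numeric), ...");

    public String buildContext(String question) {
        // The base ("principal") system context is always included.
        StringBuilder context = new StringBuilder("You are an SQL-based chat for our database.");
        // Attach only the schemas whose keyword appears in the question.
        for (Map.Entry<String, String> entry : TABLE_SCHEMAS.entrySet()) {
            if (question.toLowerCase().contains(entry.getKey())) {
                context.append("\n\n").append(entry.getValue());
            }
        }
        return context.toString();
    }
}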

What I'm currently doing is adding to the context the referenced table plus the main system context ("you are an SQL-based chat..."). And for the chat to remember, I have tried saving the history in Java and passing the whole history back again, but this exceeds the token limit pretty fast.


This is what I'm currently doing (the AI does not remember anything):


chatMessages.add(new ChatMessage(ChatRole.SYSTEM, context));
chatMessages.add(new ChatMessage(ChatRole.USER, question));

ChatCompletions chatCompletions = client.getChatCompletions(
        deploymentOrModelId, new ChatCompletionsOptions(chatMessages));

Solution:

As far as I know, there is no way to make the LLM (Azure OpenAI in this case) remember your context cheaply; as you said, sending context (and a huge chunk of it) on each call gets pricey really fast. That being said, you could change the approach and try other techniques to mimic memory, like summarizing the previous questions and sending that as context: instead of a long string with 20 questions/answers, you send a short summary of what the user has been asking for. It will keep your prompt short and kind of "aware" of the conversation.

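A rough sketch of that summarization idea, assuming the same beta ChatMessage/ChatRole API as your snippet above (the class name, the prompt wording and the accessor chain on ChatCompletions are illustrative and may differ slightly between SDK versions):

import com.azure.ai.openai.OpenAIClient;
import com.azure.ai.openai.models.*;

import java.util.ArrayList;
import java.util.List;

public class SummarizingChat {

    private final OpenAIClient client;
    private final String deploymentOrModelId;
    private String runningSummary = ""; // short recap of what the user has asked so far

    public SummarizingChat(OpenAIClient client, String deploymentOrModelId) {
        this.client = client;
        this.deploymentOrModelId = deploymentOrModelId;
    }

    public String ask(String context, String question) {
        List<ChatMessage> messages = new ArrayList<>();
        // System prompt = base context + the short summary instead of the full history.
        messages.add(new ChatMessage(ChatRole.SYSTEM,
                context + "\n\nConversation so far (summary): " + runningSummary));
        messages.add(new ChatMessage(ChatRole.USER, question));

        ChatCompletions completions = client.getChatCompletions(
                deploymentOrModelId, new ChatCompletionsOptions(messages));
        String answer = completions.getChoices().get(0).getMessage().getContent();

        updateSummary(question, answer);
        return answer;
    }

    private void updateSummary(String question, String answer) {
        // Ask the model itself to compress the previous summary plus the latest turn.
        List<ChatMessage> messages = new ArrayList<>();
        messages.add(new ChatMessage(ChatRole.SYSTEM,
                "Summarize the conversation below in at most 3 sentences, "
                        + "keeping any table and column names mentioned so far."));
        messages.add(new ChatMessage(ChatRole.USER,
                "Previous summary: " + runningSummary
                        + "\nUser: " + question + "\nAssistant: " + answer));

        ChatCompletions completions = client.getChatCompletions(
                deploymentOrModelId, new ChatCompletionsOptions(messages));
        runningSummary = completions.getChoices().get(0).getMessage().getContent();
    }
}

The summarization call costs a few extra tokens per turn, but it keeps the prompt for the main call short no matter how long the conversation gets.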

There are also conversation buffers (keeping the chat history in memory and sending it to the LLM each time, as you did), but they get long pretty fast. For that you could configure a buffer window (limiting the memory of the conversation to the last 3 questions, for example), which should help keep the token count manageable.

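A minimal sketch of such a buffer window, again assuming the same SDK types as your snippet (the class name, MAX_TURNS and the Deque-based storage are illustrative):

import com.azure.ai.openai.OpenAIClient;
import com.azure.ai.openai.models.*;

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class WindowedChat {

    private static final int MAX_TURNS = 3; // keep only the last 3 question/answer pairs
    private final Deque<ChatMessage> window = new ArrayDeque<>();
    private final OpenAIClient client;
    private final String deploymentOrModelId;

    public WindowedChat(OpenAIClient client, String deploymentOrModelId) {
        this.client = client;
        this.deploymentOrModelId = deploymentOrModelId;
    }

    public String ask(String context, String question) {
        List<ChatMessage> messages = new ArrayList<>();
        messages.add(new ChatMessage(ChatRole.SYSTEM, context)); // base context only once
        messages.addAll(window);                                  // the limited history
        messages.add(new ChatMessage(ChatRole.USER, question));

        ChatCompletions completions = client.getChatCompletions(
                deploymentOrModelId, new ChatCompletionsOptions(messages));
        String answer = completions.getChoices().get(0).getMessage().getContent();

        // Remember this turn and drop the oldest one once the window is full.
        window.addLast(new ChatMessage(ChatRole.USER, question));
        window.addLast(new ChatMessage(ChatRole.ASSISTANT, answer));
        while (window.size() > MAX_TURNS * 2) {
            window.removeFirst();
        }
        return answer;
    }
}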

There are several ways to manage this, but there is no "perfect memory" as far as I know, at least not one that is worth paying for. If you could tell us a bit more about how good the bot's memory needs to be, or about the specific use case, maybe we can be more precise. Good luck!

