
transformers-Generation with LLMs

https://huggingface.co/docs/transformers/main/en/llm_tutorial

The stopping condition is determined by the model: the model should learn when to output an end-of-sequence (EOS) token. If it does not, generation stops when a predefined maximum length is reached.

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", device_map="auto", load_in_4bit=True
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1", padding_side="left")
model_inputs = tokenizer(["A list of colors: red, blue"], return_tensors="pt").to("cuda")
generated_ids = model.generate(**model_inputs)
tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
'A list of colors: red, blue, green, yellow, orange, purple, pink,'
tokenizer.pad_token = tokenizer.eos_token  # Most LLMs don't have a pad token by default
model_inputs = tokenizer(["A list of colors: red, blue", "Portugal is"], return_tensors="pt", padding=True
).to("cuda")
generated_ids = model.generate(**model_inputs)
tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
['A list of colors: red, blue, green, yellow, orange, purple, pink,',
'Portugal is a country in southwestern Europe, on the Iber']
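
If a model never learns to emit EOS, both stopping conditions described above can be pinned down explicitly in the generate call. A minimal sketch, assuming illustrative values rather than tuned ones:

generated_ids = model.generate(
    **model_inputs,
    max_length=64,  # hard cap on total length (prompt + new tokens); illustrative value
    eos_token_id=tokenizer.eos_token_id,  # stop as soon as the model emits EOS
)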

There are many generation strategies, and the defaults may not suit your use case. The sections below walk through the most common pitfalls; as a starting point, decoding options can also be bundled into a GenerationConfig, as sketched below.
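
A minimal sketch of bundling decoding options into a GenerationConfig object instead of passing them one by one; the specific values here are illustrative assumptions, not recommendations:

from transformers import GenerationConfig

generation_config = GenerationConfig(
    max_new_tokens=50,   # illustrative cap on new tokens
    do_sample=True,      # sample instead of greedy decoding
    temperature=0.7,     # illustrative sampling temperature
)
generated_ids = model.generate(**model_inputs, generation_config=generation_config)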

Generated output is too short or too long

If not specified in the GenerationConfig file, generate returns up to 20 tokens by default. It is recommended to set max_new_tokens manually in the generate call to control the maximum number of new tokens it can return. Note that LLMs (more precisely, decoder-only models) also return the input prompt as part of the output; a way to strip it is sketched after the example below.

model_inputs = tokenizer(["A sequence of numbers: 1, 2"], return_tensors="pt").to("cuda")# By default, the output will contain up to 20 tokens
generated_ids = model.generate(**model_inputs)
tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
'A sequence of numbers: 1, 2, 3, 4, 5'

# Setting `max_new_tokens` allows you to control the maximum length
generated_ids = model.generate(**model_inputs, max_new_tokens=50)
tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
'A sequence of numbers: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,'
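
Since decoder-only models echo the prompt, a common follow-up is to slice it off before decoding, the same pattern the chat-template example at the end of this page uses. A minimal sketch reusing the tensors above:

input_length = model_inputs.input_ids.shape[1]  # number of prompt tokens
# Keep only the newly generated tokens
print(tokenizer.batch_decode(generated_ids[:, input_length:], skip_special_tokens=True)[0])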

Wrong generation mode

By default, unless specified in the GenerationConfig file, generate selects the most likely token at each iteration (greedy decoding).

# Set seed for reproducibility -- you don't need this unless you want full reproducibility
from transformers import set_seed
set_seed(42)
model_inputs = tokenizer(["I am a cat."], return_tensors="pt").to("cuda")

# LLM + greedy decoding = repetitive, boring output
generated_ids = model.generate(**model_inputs)
tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
'I am a cat. I am a cat. I am a cat. I am a cat'

# With sampling, the output becomes more creative!
generated_ids = model.generate(**model_inputs, do_sample=True)
tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
'I am a cat.  Specifically, I am an indoor-only cat.  I'
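
Greedy decoding and sampling are not the only modes: beam search, for example, keeps several candidate continuations in parallel and returns the highest-scoring one. A minimal sketch, where num_beams=4 is an illustrative assumption rather than a tuned value:

generated_ids = model.generate(**model_inputs, num_beams=4, max_new_tokens=20)
tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]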

Wrong padding side

LLMs are decoder-only architectures, which means they keep iterating on the input prompt. If the inputs are not all the same length, they need to be padded. Because LLMs are not trained to continue generating from pad tokens, the inputs must be left-padded. Also remember to pass the attention mask to generate (a quick check is sketched after the snippets below)!

# A tokenizer loaded without `padding_side` uses right-padding by default: the 1st sequence,
# which is shorter, gets padding on the right side. Generation fails to capture the logic.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
tokenizer.pad_token = tokenizer.eos_token  # Most LLMs don't have a pad token by default
model_inputs = tokenizer(
    ["1, 2, 3", "A, B, C, D, E"], padding=True, return_tensors="pt"
).to("cuda")
generated_ids = model.generate(**model_inputs)
tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
'1, 2, 33333333333'

# With left-padding, it works as expected!
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1", padding_side="left")
tokenizer.pad_token = tokenizer.eos_token  # Most LLMs don't have a pad token by default
model_inputs = tokenizer(["1, 2, 3", "A, B, C, D, E"], padding=True, return_tensors="pt"
).to("cuda")
generated_ids = model.generate(**model_inputs)
tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
'1, 2, 3, 4, 5, 6,'
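
As for the attention mask: the tokenizer output already carries it, and the ** unpacking is what forwards it to generate. A quick check on the model_inputs object from the snippet above:

print(model_inputs.keys())  # expected: dict_keys(['input_ids', 'attention_mask'])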

Wrong prompt

Some models and tasks expect a specific input prompt format to work properly. When that format is not used, performance degrades silently: the model still runs, but not as well as if the expected prompt had been followed. A way to inspect the rendered template is sketched at the end of this section.

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-alpha")
model = AutoModelForCausalLM.from_pretrained("HuggingFaceH4/zephyr-7b-alpha", device_map="auto", load_in_4bit=True
)
set_seed(0)
prompt = """How many helicopters can a human eat in one sitting? Reply as a thug."""
model_inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
input_length = model_inputs.input_ids.shape[1]
generated_ids = model.generate(**model_inputs, max_new_tokens=20)
print(tokenizer.batch_decode(generated_ids[:, input_length:], skip_special_tokens=True)[0])
"I'm not a thug, but i can tell you that a human cannot eat"
# Oh no, it did not follow our instruction to reply as a thug! Let's see what happens when we write
# a better prompt and use the right template for this model (through `tokenizer.apply_chat_template`)
set_seed(0)
messages = [{"role": "system","content": "You are a friendly chatbot who always responds in the style of a thug",},{"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
model_inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to("cuda")
input_length = model_inputs.shape[1]
generated_ids = model.generate(model_inputs, do_sample=True, max_new_tokens=20)
print(tokenizer.batch_decode(generated_ids[:, input_length:], skip_special_tokens=True)[0])
'None, you thug. How bout you try to focus on more useful questions?'
# As we can see, it followed a proper thug style 😎
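
To inspect what the chat template actually produces, it can be rendered as plain text instead of token IDs (tokenize=False is part of the apply_chat_template API):

print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))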

