当前位置：首页 > news >正文

本地运行LLama 3.2的三种方法

news 2025/8/25 11:56:13

在这里插入图片描述
大型语言模型（LLMs）已经彻底改变了AI领域，小型模型也在崛起。因此，即使是在旧的PC和智能手机上运行先进的LLMs也成为了可能。为了给大家一个起点，我们将探索三种不同的方法来本地与LLama 3.2进行交互。

先决条件

在这里插入图片描述

在我们深入探讨之前，请确保你已经：

安装并运行了Ollama
已经拉取了LLama 3.2模型（在终端中使用 ollama pull llama3.2）

现在，让我们来探索这三种方法！

Ollama的Python包提供了一种简便的方法，可以在你的Python脚本或Jupyter笔记本中与LLama 3.2进行交互。

import ollamaresponse = ollama.chat(model="llama3.2",messages=[{"role": "user","content": "Tell me an interesting fact about elephants",},],
)
print(response["message"]["content"])

这种方法非常适合简单的同步交互。但如果你想要流式接收响应呢？Ollama为你提供了AsyncClient：

import asyncio
from ollama import AsyncClientasync def chat():message = {"role": "user","content": "Tell me an interesting fact about elephants"}async for part in await AsyncClient().chat(model="llama3.2", messages=[message], stream=True):print(part["message"]["content"], end="", flush=True)# Run the async function
asyncio.run(chat())

方法二：使用Ollama API

对于那些更喜欢直接使用API或想要将LLama 3.2集成到非Python应用程序中的人，Ollama提供了一个简单的HTTP API。

curl http://localhost:11434/api/chat -d '{"model": "llama3.2","messages": [{"role": "user","content": "What are God Particles?"}],"stream": false
}'

这种方法为你提供了从任何能够发出HTTP请求的语言或工具与LLama 3.2进行交互的灵活性。

方法三：使用Langchain构建高级应用程序

对于更复杂的应用程序，特别是涉及文档分析和检索的应用程序，Langchain与Ollama和LLama 3.2可以无缝集成。

以下代码片段展示了加载文档、创建嵌入和执行相似性搜索的过程：

from langchain_community.document_loaders import DirectoryLoader, UnstructuredWordDocumentLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import Chroma# 加载文档
loader = DirectoryLoader('/path/to/documents', glob="**/*.docx", loader_cls=UnstructuredWordDocumentLoader)
documents = loader.load()# 将文档拆分为多个块
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
splits = text_splitter.split_documents(documents)# 创建嵌入和向量存储
embeddings = OllamaEmbeddings(model="nomic-embed-text")
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)# 初始化LLama 3.2
llm = Ollama(model="llama3.2", base_url="http://localhost:11434")# 执行相似性搜索并生成回答
query = "What was the main accomplishment of Thomas Jefferson?"
similar_docs = vectorstore.similarity_search(query)
context = "\n".join([doc.page_content for doc in similar_docs])
response = llm(f"上下文: {context}\n问题: {query}\n回答:")
print(response)

这种方法允许你构建能够使用LLama 3.2强大的语言理解能力来理解和推理大量文本数据的应用程序。

开始生成提示了！

下载成功后，你可以在终端中使用以下命令运行模型：

$ ollama run llama3.2:3b-instruct-q8_0

现在你已经可以生成一些图片提示了。假设我想为我的博客文章创建一个缩略图，于是我运行了下面的提示词：

Generate a random image prompt that I can use as a thumbnail for my article.

这是生成的结果：

Here's a random image prompt for you:"A misty, moonlit forest with an old, gnarled tree in the center, its branches twisted and tangled like a giant's fingers. In the background, a subtle glow
emanates from a faint, ethereal light that seems to be seeping from the very earth itself."Feel free to use this prompt as is or modify it to fit your article's theme and style!

在这里插入图片描述