Batch-Converting Word, Excel, and PPT Files to PDF with Python

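Today I came across an interesting script: batch-converting Word, Excel, and PPT files to PDF with Python.

I use Word a lot, so I know how tedious this task can be. We won't go into how the conversion is implemented, because studying it doesn't teach much; otherwise it wouldn't be kept around as a utility script. I have added it to the pyzjr library so that everyone can call it easily.

You can install pyzjr with: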

pip install pyzjr -i https://pypi.tuna.tsinghua.edu.cn/simple

Usage:

import pyzjr as pz

# Instantiate the converter
Mpdf = pz.Microsoft2PDF()
# Call the conversion methods
Mpdf.Word2Pdf()  # word -> pdf
Mpdf.Excel2Pdf()  # excel -> pdf
Mpdf.PPt2Pdf()  # ppt -> pdf
Mpdf.WEP2Pdf()  # word,excel,ppt -> pdf

That is the whole API. All converted files are saved in a new folder named pdf, created inside the target folder.

Source code in pyzjr:
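To make the output location concrete, here is a minimal sketch of how a target path could be derived: the source file's extension is swapped for .pdf and the result is placed in a pdf subfolder. The helper name pdf_target and the D:\reports folder are hypothetical and used only for illustration; PureWindowsPath is used so the sketch behaves the same on any operating system.

```python
from pathlib import PureWindowsPath


def pdf_target(folder: str, file_name: str) -> str:
    """Return the path of the PDF generated for file_name (hypothetical helper)."""
    source = PureWindowsPath(folder) / file_name
    # with_suffix replaces only the last extension, e.g. .docx -> .pdf
    return str(source.parent / "pdf" / source.with_suffix(".pdf").name)


print(pdf_target(r"D:\reports", "summary.docx"))  # D:\reports\pdf\summary.pdf
```

Excel workbooks are a special case in the pyzjr source: each worksheet is exported to its own file, with a _worksheetN suffix added before .pdf.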

import gc
import os

import win32com.client


class Microsoft2PDF():
    """Convert Microsoft Office documents (Word, Excel, PowerPoint) to PDF format"""

    def __init__(self, filePath=""):
        """
        :param filePath: folder to scan; an empty string means the current working directory
        """
        self.flagW = self.flagE = self.flagP = 1
        self.words = []
        self.ppts = []
        self.excels = []
        if filePath == "":
            filePath = os.getcwd()
        # Office COM calls need absolute paths
        self.filePath = os.path.abspath(filePath)
        # pyzjr creates this folder with its own CreateFolder helper;
        # os.makedirs is substituted here so that the listing runs on its own
        self.folder = os.path.join(self.filePath, 'pdf')
        os.makedirs(self.folder, exist_ok=True)
        for i in os.listdir(self.filePath):
            if i.endswith(('.doc', '.docx')):
                self.words.append(i)
            if i.endswith(('.ppt', '.pptx')):
                self.ppts.append(i)
            if i.endswith(('.xls', '.xlsx')):
                self.excels.append(i)
        if len(self.words) < 1:
            print("\n[pyzjr]:No Word files\n")
            self.flagW = 0
        if len(self.ppts) < 1:
            print("\n[pyzjr]:No PPT file\n")
            self.flagP = 0  # fixed: the original cleared flagE here
        if len(self.excels) < 1:
            print("\n[pyzjr]:No Excel file\n")
            self.flagE = 0  # fixed: the original cleared flagP here

    def Word2Pdf(self):
        if self.flagW == 0:
            return 0
        else:
            print("\n[Start Word -> PDF conversion]")
            try:
                print("Open Word Process...")
                word = win32com.client.Dispatch("Word.Application")
                word.Visible = 0
                word.DisplayAlerts = False
                doc = None
                for i in range(len(self.words)):
                    print(i)
                    fileName = self.words[i]  # file name
                    fromFile = os.path.join(self.filePath, fileName)  # file address
                    toFileName = self.changeSufix2Pdf(fileName)  # generated file name
                    toFile = self.toFileJoin(toFileName)  # generated file address
                    print("Conversion: " + fileName + " in files...")
                    try:
                        doc = word.Documents.Open(fromFile)
                        doc.SaveAs(toFile, 17)  # 17 = wdFormatPDF
                        print("Convert to: " + toFileName + " file completion")
                    except Exception as e:
                        print(e)
                    finally:
                        # fixed: close every document, not only the last one
                        if doc is not None:
                            doc.Close()
                            doc = None
                print("All Word files have been printed")
                print("End Word Process...\n")
                word.Quit()
                word = None
            except Exception as e:
                print(e)
            finally:
                gc.collect()

    def Excel2Pdf(self):
        if self.flagE == 0:
            return 0
        else:
            print("\n[Start Excel -> PDF conversion]")
            try:
                print("Open Excel Process...")
                excel = win32com.client.Dispatch("Excel.Application")
                excel.Visible = 0
                excel.DisplayAlerts = False
                wb = None
                ws = None
                for i in range(len(self.excels)):
                    print(i)
                    fileName = self.excels[i]
                    fromFile = os.path.join(self.filePath, fileName)
                    print("Conversion: " + fileName + " in files...")
                    try:
                        wb = excel.Workbooks.Open(fromFile)
                        # One workbook may have multiple worksheets
                        for j in range(wb.Worksheets.Count):
                            toFileName = self.addWorksheetsOrder(fileName, j + 1)
                            toFile = self.toFileJoin(toFileName)
                            ws = wb.Worksheets(j + 1)
                            ws.ExportAsFixedFormat(0, toFile)  # 0 = xlTypePDF
                            print("Convert to: " + toFileName + " file completion")
                    except Exception as e:
                        print(e)
                    finally:
                        # fixed: close every workbook, not only the last one
                        ws = None
                        if wb is not None:
                            wb.Close()
                            wb = None
                # Shut down the Excel process
                print("All Excel files have been printed")
                print("Ending Excel process...\n")
                excel.Quit()
                excel = None
            except Exception as e:
                print(e)
            finally:
                gc.collect()

    def PPt2Pdf(self):
        if self.flagP == 0:
            return 0
        else:
            print("\n[Start PPT -> PDF conversion]")
            try:
                print("Opening PowerPoint process...")
                powerpoint = win32com.client.Dispatch("PowerPoint.Application")
                ppt = None
                for i in range(len(self.ppts)):
                    print(i)
                    fileName = self.ppts[i]
                    fromFile = os.path.join(self.filePath, fileName)
                    toFileName = self.changeSufix2Pdf(fileName)
                    toFile = self.toFileJoin(toFileName)
                    print("Conversion: " + fileName + " in files...")
                    try:
                        ppt = powerpoint.Presentations.Open(fromFile, WithWindow=False)
                        if ppt.Slides.Count > 0:
                            ppt.SaveAs(toFile, 32)  # 32 = ppSaveAsPDF
                            print("Convert to: " + toFileName + " file completion")
                        else:
                            print("Error, unexpected: This file is empty, skipping this file")
                    except Exception as e:
                        print(e)
                    finally:
                        # fixed: close every presentation, not only the last one
                        if ppt is not None:
                            ppt.Close()
                            ppt = None
                print("All PPT files have been printed")
                print("Ending PowerPoint process...\n")
                powerpoint.Quit()
                powerpoint = None
            except Exception as e:
                print(e)
            finally:
                gc.collect()

    def WEP2Pdf(self):
        """
        Word, Excel and PPT are all converted to PDF.
        If there are many files, it may take some time
        """
        print("Convert Microsoft Three Musketeers to PDF")
        self.Word2Pdf()
        self.Excel2Pdf()
        self.PPt2Pdf()
        print(f"All files have been converted, you can find them in the {self.folder}")

    def changeSufix2Pdf(self, file):
        """Change the file extension to .pdf"""
        return file[:file.rfind('.')] + ".pdf"

    def addWorksheetsOrder(self, file, i):
        """Append the worksheet index to the file name"""
        return file[:file.rfind('.')] + "_worksheet" + str(i) + ".pdf"

    def toFileJoin(self, file):
        """Join the output folder and the file name into a full file path"""
        return os.path.join(self.filePath, 'pdf', file[:file.rfind('.')] + ".pdf")
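The bare numbers passed to SaveAs and ExportAsFixedFormat in the source are Office enumeration values. The sketch below simply gives them names. The class name OfficePdfFormat and the function format_code are my own and are not part of pyzjr; the numeric values are the ones Office defines as wdFormatPDF, xlTypePDF, and ppSaveAsPDF.

```python
from enum import IntEnum


class OfficePdfFormat(IntEnum):
    """Named versions of the numeric PDF codes used in the source (illustrative)."""
    WORD = 17        # wdFormatPDF, passed to Document.SaveAs
    EXCEL = 0        # xlTypePDF, passed to Worksheet.ExportAsFixedFormat
    POWERPOINT = 32  # ppSaveAsPDF, passed to Presentation.SaveAs


def format_code(file_name: str) -> OfficePdfFormat:
    """Pick the PDF code from the file extension (hypothetical helper)."""
    lowered = file_name.lower()
    if lowered.endswith((".doc", ".docx")):
        return OfficePdfFormat.WORD
    if lowered.endswith((".xls", ".xlsx")):
        return OfficePdfFormat.EXCEL
    if lowered.endswith((".ppt", ".pptx")):
        return OfficePdfFormat.POWERPOINT
    raise ValueError(f"not an Office file: {file_name}")


print(int(format_code("slides.pptx")))  # 32
```

Because IntEnum members are plain integers, a value such as OfficePdfFormat.WORD could be passed wherever the source passes 17.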

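Here I made some optimizations to the original blogger's code so that it can be called as a library.

As the console output shows, when WEP2Pdf is called and the current folder contains no Word files, the conversion still carries on with the other formats.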
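The skip behaviour can be sketched without Office installed: files are grouped by extension, and only the non-empty groups are scheduled for conversion. The names group_office_files and conversions_to_run are hypothetical; the extension lists mirror the ones in the source, while the lower() call is my addition (the source matches extensions case-sensitively).

```python
from typing import Dict, List

# Extensions per document kind, mirroring the checks in the source
EXTENSIONS = {
    "word": (".doc", ".docx"),
    "excel": (".xls", ".xlsx"),
    "ppt": (".ppt", ".pptx"),
}


def group_office_files(names: List[str]) -> Dict[str, List[str]]:
    """Sort file names into word/excel/ppt groups; other files are ignored."""
    groups: Dict[str, List[str]] = {kind: [] for kind in EXTENSIONS}
    for name in names:
        for kind, suffixes in EXTENSIONS.items():
            if name.lower().endswith(suffixes):  # lower() is an assumption
                groups[kind].append(name)
    return groups


def conversions_to_run(groups: Dict[str, List[str]]) -> List[str]:
    """Return only the kinds that have at least one file to convert."""
    return [kind for kind, files in groups.items() if files]


files = ["budget.xlsx", "deck.pptx", "notes.txt"]
print(conversions_to_run(group_office_files(files)))  # ['excel', 'ppt']
```

With no Word file in the list, the word group is empty and is skipped, while Excel and PPT are still processed.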
