Published 1 year ago

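Adding text to a PDF with PyPDF2 in Python 3 (handling garbled Chinese characters)

Sometimes we need to add text to an existing PDF. In Python 3.x this can be done with the third-party PyPDF2 and reportlab packages together with io.BytesIO from the standard library.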

Environment setup

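1. Install PyPDF2 (the examples below use PdfFileReader and PdfFileWriter, which were removed in PyPDF2 3.0, so install a version below 3.0)

pip install "PyPDF2<3.0"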

2. Install reportlab

pip install reportlab 

Python 3.x example

from PyPDF2 import PdfFileWriter, PdfFileReader
import io
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter
import sys

# Suppress warnings
if not sys.warnoptions:
    import warnings
    warnings.simplefilter("ignore")

packet = io.BytesIO()
# Create a new PDF with Reportlab
can = canvas.Canvas(packet, pagesize=letter)
can.drawString(10, 100, "Hello world")
can.save()
# Rewind the buffer to offset 0
packet.seek(0)
new_pdf = PdfFileReader(packet)
# Read the existing PDF
existing_pdf = PdfFileReader(open("original.pdf", "rb"))
output = PdfFileWriter()
# Merge the new PDF onto the first page of the existing PDF
page = existing_pdf.getPage(0)
page.mergePage(new_pdf.getPage(0))
output.addPage(page)
# Finally, write the result to the destination PDF
outputStream = open("destination.pdf", "wb")
output.write(outputStream)
outputStream.close()
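
If PyPDF2 3.x is installed, the PdfFileReader and PdfFileWriter names used above are no longer available. The following is a minimal sketch of the same steps with the newer names (PdfReader, PdfWriter, pages, merge_page, add_page). The function name stamp_first_page and its parameters are made up for this illustration, and the imports are placed inside the function so that the file still loads where the libraries are missing.

```python
import io


def stamp_first_page(src_path, dst_path, text, x=10, y=100):
    """Overlay `text` on the first page of src_path and write it to dst_path.

    Hypothetical helper; like the example above, only the first page is written.
    """
    # Assumes PyPDF2 3.x and reportlab are installed.
    from PyPDF2 import PdfReader, PdfWriter
    from reportlab.lib.pagesizes import letter
    from reportlab.pdfgen import canvas

    # Draw the text onto a single-page PDF held in memory.
    packet = io.BytesIO()
    can = canvas.Canvas(packet, pagesize=letter)
    can.drawString(x, y, text)
    can.save()
    packet.seek(0)

    overlay = PdfReader(packet)
    writer = PdfWriter()
    # Keep the source file open until the output has been written.
    with open(src_path, "rb") as src:
        page = PdfReader(src).pages[0]
        page.merge_page(overlay.pages[0])
        writer.add_page(page)
        with open(dst_path, "wb") as dst:
            writer.write(dst)
```

A call such as stamp_first_page("original.pdf", "destination.pdf", "Hello world") would then mirror the example above.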

Python 2.7 example

In Python 2.7, pyPdf and StringIO can be used to add text to an existing PDF.

from pyPdf import PdfFileWriter, PdfFileReader
import StringIO
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter
packet = StringIO.StringIO()
# Create a new PDF with Reportlab
can = canvas.Canvas(packet, pagesize=letter)
can.drawString(10, 100, "Hello world")
can.save()
# Rewind the StringIO buffer to offset 0
packet.seek(0)
new_pdf = PdfFileReader(packet)
# Read the existing PDF
existing_pdf = PdfFileReader(file("original.pdf", "rb"))
output = PdfFileWriter()
# Merge the new PDF onto the first page of the existing PDF
page = existing_pdf.getPage(0)
page.mergePage(new_pdf.getPage(0))
output.addPage(page)
# Finally, write the result to the destination PDF
outputStream = file("destination.pdf", "wb")
output.write(outputStream)
outputStream.close()
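
The two examples differ mainly in the buffer type. In Python 3, PDF data is bytes, so a binary buffer (io.BytesIO) is needed rather than a text buffer. The sketch below uses only the standard library to illustrate this, and to show in general terms why a buffer is rewound with seek(0) before it is read. The payload is a placeholder, not a real PDF.

```python
import io

payload = b"%PDF-1.4 placeholder bytes"  # stand-in for what can.save() writes

# A text buffer rejects binary data in Python 3.
try:
    io.StringIO().write(payload)
    text_buffer_accepts_bytes = True
except TypeError:
    text_buffer_accepts_bytes = False

packet = io.BytesIO()
packet.write(payload)

# After writing, the position sits at the end, so a read returns nothing.
position_after_write = packet.tell()
read_before_rewind = packet.read()

# Rewinding to offset 0 lets a reader see the data from the start.
packet.seek(0)
read_after_rewind = packet.read()

print(text_buffer_accepts_bytes, position_after_write,
      read_before_rewind, read_after_rewind == payload)
```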

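Garbled Chinese characters

The fonts that reportlab uses by default do not support Chinese. To draw Chinese text, you need to install and register a Chinese font yourself.

For example, SimSun.ttf is a Chinese font. Download it, place it in the site-packages/reportlab/fonts directory, and then register it: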

from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont

# Register the Chinese font
pdfmetrics.registerFont(TTFont('SimSun', "SimSun.ttf"))
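
Registration fails if the font file cannot be found, so it can help to check for the file first. The following sketch is a hypothetical helper (find_font_file and the directory list are assumptions for illustration, not part of reportlab) that looks for a font file in a few candidate directories and returns its full path.

```python
from pathlib import Path


def find_font_file(filename, search_dirs):
    """Return the first existing path to `filename` in `search_dirs`, or None."""
    for directory in search_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    return None


# Hypothetical search locations; adjust them to where the font actually lives.
candidate_dirs = [Path.cwd(), Path.cwd() / "fonts"]
font_path = find_font_file("SimSun.ttf", candidate_dirs)
if font_path is None:
    print("SimSun.ttf not found in:", [str(d) for d in candidate_dirs])
else:
    print("Found font at", font_path)
```

When a path is found, str(font_path) could be passed to TTFont in place of the bare file name.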

Then set the font on the canvas object (can in the examples here):

can.setFont("SimSun", 16)

Python 3.x example with Chinese text

# -*- coding: utf-8 -*-
from PyPDF2 import PdfFileWriter, PdfFileReader
import io
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont
import sys

# Suppress warnings
if not sys.warnoptions:
    import warnings
    warnings.simplefilter("ignore")

# Register the Chinese font
pdfmetrics.registerFont(TTFont('SimSun', "SimSun.ttf"))

packet = io.BytesIO()
# Create a new PDF with Reportlab
can = canvas.Canvas(packet, pagesize=letter)
can.setFont("SimSun", 14)
can.drawString(10, 100, "你好")
can.save()
# Rewind the buffer to offset 0
packet.seek(0)
new_pdf = PdfFileReader(packet)
# Read the existing PDF
existing_pdf = PdfFileReader(open("original.pdf", "rb"))
output = PdfFileWriter()
# Merge the new PDF onto the first page of the existing PDF
page = existing_pdf.getPage(0)
page.mergePage(new_pdf.getPage(0))
output.addPage(page)
# Finally, write the result to the destination PDF
outputStream = open("destination.pdf", "wb")
output.write(outputStream)
outputStream.close()
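
The coordinates passed to drawString are measured in points (1/72 inch) from the bottom-left corner of the page, with y giving the text baseline, and the letter page size is 612 x 792 points. When a position is more naturally given from the top-left corner in millimetres, a small conversion helps. The helper below is a hypothetical sketch (top_left_mm_to_points is not a reportlab function) that uses only arithmetic.

```python
LETTER_WIDTH_PT = 612.0   # 8.5 in * 72
LETTER_HEIGHT_PT = 792.0  # 11 in * 72
POINTS_PER_MM = 72.0 / 25.4


def top_left_mm_to_points(x_mm, y_mm, page_height_pt=LETTER_HEIGHT_PT):
    """Convert an offset from the top-left corner (mm) to bottom-left points."""
    x_pt = x_mm * POINTS_PER_MM
    y_pt = page_height_pt - y_mm * POINTS_PER_MM
    return x_pt, y_pt


# One inch in from the top-left corner of a letter page.
x, y = top_left_mm_to_points(25.4, 25.4)
print(round(x, 2), round(y, 2))
```

The resulting pair could be passed as the first two arguments of can.drawString.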