pyPdf

Out of date!



This page is no longer updated. I have stopped maintaining pyPdf. A company named Phaseit has forked the project and, with my blessing, continues development and maintenance as PyPDF2 (http://knowah.github.com/PyPDF2/).



About

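A pure-Python library built as a PDF toolkit. It is capable of extracting document information (title, author, and so on), splitting and merging documents page by page, cropping pages, merging multiple pages into a single page, and encrypting and decrypting PDF files.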

Because it is pure Python, it should run on any Python platform without depending on external libraries. It can also work entirely on StringIO objects rather than file streams, so PDFs can be manipulated in memory. This makes it a useful tool for websites that manage or manipulate PDFs.
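To illustrate the in-memory idea, here is a minimal sketch that does not import pyPdf at all. It assumes only that a writer object exposes a write(stream) method, as PdfFileWriter does in the Example section of this page. StubWriter and render_to_bytes are hypothetical names invented for the illustration, and io.BytesIO is used as the standard-library in-memory binary stream in place of StringIO.

```python
import io


class StubWriter(object):
    """Hypothetical stand-in for a PDF writer; it only mimics write(stream)."""

    def __init__(self, payload):
        self.payload = payload

    def write(self, stream):
        # A real writer would serialize its PDF objects here.
        stream.write(self.payload)


def render_to_bytes(writer):
    """Have the writer write into an in-memory buffer and return the bytes."""
    buffer = io.BytesIO()
    writer.write(buffer)
    return buffer.getvalue()


# No file is opened at any point; the result lives entirely in memory.
pdf_bytes = render_to_bytes(StubWriter(b"%PDF-1.3\n%%EOF\n"))
```

In a web application, bytes obtained this way could be sent as the response body without touching the filesystem.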

Download Latest

The latest release of pyPdf is version 1.13, released on December 4, 2010. All releases of pyPdf are distributed under the terms of a modified BSD license.

Documentation

Documentation of the pyPdf module is available online. It is produced by PythonDoc, so the same documentation can also be read inline in the source code.

Source Code Repository

pyPdf is distributed under the terms of a modified BSD license. The complete source code and history are available to anyone interested through a git repository at http://github.com/mfenniak/pyPdf/tree/trunk. A Python 3.0 compatible branch is also available at http://github.com/mfenniak/pyPdf/tree/py3.

Example

from pyPdf import PdfFileWriter, PdfFileReader

output = PdfFileWriter()
input1 = PdfFileReader(file("document1.pdf", "rb"))

# print the title of document1.pdf
print "title = %s" % (input1.getDocumentInfo().title)

# add page 1 from input1 to output document, unchanged
output.addPage(input1.getPage(0))

# add page 2 from input1, but rotated clockwise 90 degrees
output.addPage(input1.getPage(1).rotateClockwise(90))

# add page 3 from input1, rotated the other way:
output.addPage(input1.getPage(2).rotateCounterClockwise(90))
# alt: output.addPage(input1.getPage(2).rotateClockwise(270))

# add page 4 from input1, but first add a watermark from another pdf:
page4 = input1.getPage(3)
watermark = PdfFileReader(file("watermark.pdf", "rb"))
page4.mergePage(watermark.getPage(0))
output.addPage(page4)

# add page 5 from input1, but crop it by halving the upper-right corner of its media box:
page5 = input1.getPage(4)
page5.mediaBox.upperRight = (
    page5.mediaBox.getUpperRight_x() / 2,
    page5.mediaBox.getUpperRight_y() / 2
)
output.addPage(page5)

# print how many pages input1 has:
print "document1.pdf has %s pages." % input1.getNumPages()

# finally, write "output" to document-output.pdf
outputStream = file("document-output.pdf", "wb")
output.write(outputStream)
outputStream.close()
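The alternative noted for page 3 in the example (rotating clockwise by 270 degrees instead of counterclockwise by 90) rests on simple angle arithmetic. The sketch below shows that arithmetic on its own. The helper clockwise_equivalent is a hypothetical name, not part of pyPdf, and the multiple-of-90 check reflects the assumption that PDF page rotations are specified in 90-degree steps.

```python
def clockwise_equivalent(counter_clockwise_degrees):
    """Return the clockwise angle that yields the same page orientation."""
    # Assumption: page rotations are given in multiples of 90 degrees.
    if counter_clockwise_degrees % 90 != 0:
        raise ValueError("page rotation must be a multiple of 90 degrees")
    # Turning left by d degrees is the same as turning right by 360 - d.
    return (360 - counter_clockwise_degrees) % 360
```

For instance, clockwise_equivalent(90) gives 270, matching the commented-out alternative in the example.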

Changelog

Version 1.12, 2008-09-02

Version 1.11, 2008-05-09

Version 1.10, 2007-10-04

Version 1.9, 2006-12-15

Version 1.8, 2006-12-14

Version 1.7, 2006-12-10

Version 1.6, 2006-06-06

Version 1.5, 2006-01-28

Version 1.4, 2006-01-27

Version 1.3, 2006-01-23

Version 1.2, 2006-01-23

Version 1.1, 2006-01-18

Version 1.0, 2006-01-17