A class representing a destination within a PDF file.
For more information about this class, see The Destination Class.
A class representing the basic document metadata provided in a PDF File.
For more information about this class, see The DocumentInformation Class.
This class represents a single page within a PDF file.
For more information about this class, see The PageObject Class.
Initializes a PdfFileReader object.
For more information about this class, see The PdfFileReader Class.
This class supports writing PDF files out, given pages produced by another class (typically PdfFileReader).
For more information about this class, see The PdfFileWriter Class.
A class representing a destination within a PDF file. See section 8.2.1 of the PDF 1.6 reference. Stability: Added in v1.10, will exist for all v1.x releases.
Read-only property accessing the bottom vertical coordinate.
Read-only property accessing the left horizontal coordinate.
Read-only property accessing the destination page.
Read-only property accessing the right horizontal coordinate.
Read-only property accessing the destination title.
Read-only property accessing the top vertical coordinate.
Read-only property accessing the destination type.
Read-only property accessing the zoom factor.
A class representing the basic document metadata provided in a PDF File.
As of pyPdf v1.10, every text property of the document metadata comes in two forms, e.g. author and author_raw. The non-raw property always returns a TextStringObject, which makes it ideal when the metadata is being displayed. The raw property can sometimes return a ByteStringObject if pyPdf was unable to decode the string's text encoding; callers must handle that case themselves, so the raw property is less commonly accessed.
Read-only property accessing the document's author. Added in v1.6, will exist for all future v1.x releases. Modified in v1.10 to always return a unicode string (TextStringObject).
Read-only property accessing the document's creator. If the document was converted to PDF from another format, the name of the application (for example, OpenOffice) that created the original document from which it was converted. Added in v1.6, will exist for all future v1.x releases. Modified in v1.10 to always return a unicode string (TextStringObject).
Read-only property accessing the document's producer. If the document was converted to PDF from another format, the name of the application (for example, OSX Quartz) that converted it to PDF. Added in v1.6, will exist for all future v1.x releases. Modified in v1.10 to always return a unicode string (TextStringObject).
Read-only property accessing the subject of the document. Added in v1.6, will exist for all future v1.x releases. Modified in v1.10 to always return a unicode string (TextStringObject).
Read-only property accessing the document's title. Added in v1.6, will exist for all future v1.x releases. Modified in v1.10 to always return a unicode string (TextStringObject).
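The distinction between the non-raw and raw properties can be illustrated with a small helper that gathers the displayable fields. This is a minimal sketch, not part of pyPdf: describe_metadata is a hypothetical name, and the property names title, subject, creator and producer are assumed to follow the same pattern as author.

```python
def describe_metadata(info):
    """Return a dict of displayable metadata fields.

    `info` is assumed to be a DocumentInformation-like object, or None
    when the file has no document information dictionary.
    """
    # Assumed property names; only "author" is named in the text above.
    fields = ("title", "author", "subject", "creator", "producer")
    if info is None:
        return dict((name, None) for name in fields)
    # Read the non-raw properties: they are documented as always returning
    # text, so no byte-string handling is needed before display.
    return dict((name, getattr(info, name, None)) for name in fields)
```

A caller would typically pass the object returned by the reader's getDocumentInfo function. Fields that are missing from the document come back as None in this sketch.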
This class represents a single page within a PDF file. Typically this object will be created by calling the getPage function of the PdfFileReader class.
A rectangle (RectangleObject), expressed in default user space units, defining the extent of the page's meaningful content as intended by the page's creator.
Stability: Added in v1.4, will exist for all future v1.x releases.
A rectangle (RectangleObject), expressed in default user space units, defining the region to which the contents of the page should be clipped when output in a production environment.
Stability: Added in v1.4, will exist for all future v1.x releases.
Compresses the size of this page by joining all content streams and applying a FlateDecode filter.
Stability: Added in v1.6, will exist for all future v1.x releases. However, it is possible that this function will perform no action if content stream compression becomes "automatic" for some reason.
A rectangle (RectangleObject), expressed in default user space units, defining the visible region of default user space. When the page is displayed or printed, its contents are to be clipped (cropped) to this rectangle and then imposed on the output medium in some implementation-defined manner. Default value: same as MediaBox.
Stability: Added in v1.4, will exist for all future v1.x releases.
Locate all text drawing commands, in the order they are provided in the content stream, and extract the text. This works well for some PDF files, but poorly for others, depending on the generator used. This will be refined in the future. Do not rely on the order of text coming out of this function, as it will change if this function is made more sophisticated.
Stability: Added in v1.7, will exist for all future v1.x releases. May be overhauled to provide more ordered text in the future.
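Text extraction is applied one page at a time, so collecting the text of a whole document means walking the pages. The following is a minimal sketch: extract_all_text is a hypothetical helper, the method name extractText is an assumption about how the function described above is spelled, and page numbers are assumed to start at zero.

```python
def extract_all_text(reader):
    """Collect the extracted text of every page, one list entry per page.

    `reader` is assumed to be a PdfFileReader-like object that provides
    getNumPages() and getPage(), with page numbers starting at zero.
    """
    texts = []
    for index in range(reader.getNumPages()):
        page = reader.getPage(index)
        # Assumed method name. The order of text within a page may change
        # between releases, so treat each entry as unordered page text.
        texts.append(page.extractText())
    return texts
```

Keeping one entry per page avoids depending on the ordering of text across the whole document, which the description above warns against.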
A rectangle (RectangleObject), expressed in default user space units, defining the boundaries of the physical medium on which the page is intended to be displayed or printed.
Stability: Added in v1.4, will exist for all future v1.x releases.
Merges the content streams of two pages into one. Resource references (e.g. fonts) are maintained from both pages. The mediabox, cropbox, etc. of this page are not altered. The parameter page's content stream will be added to the end of this page's content stream, meaning that it will be drawn after, or "on top" of, this page.
Stability: Added in v1.4, will exist for all future v1.x releases.
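Because the parameter page is drawn on top of the page being modified, merging is a natural way to stamp a watermark or overlay onto existing pages. A minimal sketch follows, assuming the merge function is spelled mergePage and takes the overlay page as its only argument; stamp_pages is a hypothetical helper.

```python
def stamp_pages(pages, stamp_page):
    """Draw `stamp_page` on top of each page in `pages`.

    Each page is modified in place; the stamp page itself is left
    unchanged. Returns the list of stamped pages.
    """
    stamped = []
    for page in pages:
        # Assumed method name: the stamp's content stream is appended to
        # the end of the page's own stream, so it is drawn last.
        page.mergePage(stamp_page)
        stamped.append(page)
    return stamped
```

To place content underneath instead, the roles would be reversed: merge the existing page into a copy of the background page, since the page being modified is always drawn first.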
Rotates a page clockwise by increments of 90 degrees.
Stability: Added in v1.1, will exist for all future v1.x releases.
Rotates a page counter-clockwise by increments of 90 degrees.
Stability: Added in v1.1, will exist for all future v1.x releases.
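Since both rotation functions only accept increments of 90 degrees, a caller may want to validate the angle and pick the direction from its sign. This is a sketch under assumptions: rotate_page is a hypothetical helper, and the method names rotateClockwise and rotateCounterClockwise, each taking the angle in degrees, are assumed spellings of the two functions described above.

```python
def rotate_page(page, degrees):
    """Rotate `page` by a signed multiple of 90 degrees.

    Positive values rotate clockwise, negative values counter-clockwise.
    Returns the same page object for convenience.
    """
    if degrees % 90 != 0:
        raise ValueError("rotation must be a multiple of 90 degrees")
    if degrees >= 0:
        page.rotateClockwise(degrees)  # assumed method name
    else:
        page.rotateCounterClockwise(-degrees)  # assumed method name
    return page
```

Rejecting other angles up front gives a clearer error than whatever the library itself would report for an unsupported value.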
A rectangle (RectangleObject), expressed in default user space units, defining the intended dimensions of the finished page after trimming.
Stability: Added in v1.4, will exist for all future v1.x releases.
Initializes a PdfFileReader object. This operation can take some time, as the PDF stream's cross-reference tables are read into memory.
Stability: Added in v1.0, will exist for all v1.x releases.
When using an encrypted / secured PDF file with the PDF Standard encryption handler, this function will allow the file to be decrypted. It checks the given password against the document's user password and owner password, and then stores the resulting decryption key if either password is correct.
It does not matter which password was matched. Both passwords provide the correct decryption key that will allow the document to be used with this library.
Stability: Added in v1.8, will exist for all future v1.x releases.
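A typical caller checks whether the file is encrypted before attempting to decrypt it. The sketch below assumes the encryption flag is exposed as isEncrypted and that decrypt returns a false value (0) when the password matches neither the user nor the owner password; unlock is a hypothetical helper.

```python
def unlock(reader, password):
    """Return True if `reader` is usable after trying `password`.

    Unencrypted files are reported as usable without calling decrypt().
    """
    if not reader.isEncrypted:  # assumed property name
        return True
    # Assumption: a zero result means the password was rejected. Which of
    # the two passwords matched does not matter for later use.
    return bool(reader.decrypt(password))
```

The result of decrypt is used rather than re-reading the encryption flag afterwards, because that flag is described as remaining true even after a successful decryption.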
Read-only property that accesses the getDocumentInfo function.
Stability: Added in v1.7, will exist for all future v1.x releases.
Retrieves the PDF file's document information dictionary, if it exists. Note that some PDF files use metadata streams instead of docinfo dictionaries, and these metadata streams will not be accessed by this function.
Stability: Added in v1.6, will exist for all future v1.x releases.
Retrieves the named destinations present in the document.
Stability: Added in v1.10, will exist for all future v1.x releases.
Calculates the number of pages in this PDF file.
Stability: Added in v1.0, will exist for all v1.x releases.
Retrieves the document outline present in the document.
Stability: Added in v1.10, will exist for all future v1.x releases.
Retrieves a page by number from this PDF file.
Stability: Added in v1.0, will exist for all v1.x releases.
Read-only boolean property showing whether this PDF file is encrypted. Note that this property, if true, will remain true even after the decrypt function is called.
Read-only property that accesses the getNamedDestinations function.
Stability: Added in v1.10, will exist for all future v1.x releases.
Read-only property that accesses the getNumPages function.
Stability: Added in v1.7, will exist for all future v1.x releases.
Read-only property that accesses the getOutlines function.
Stability: Added in v1.10, will exist for all future v1.x releases.
Read-only property that emulates a list based upon the getNumPages and getPage functions.
Stability: Added in v1.7, and will exist for all future v1.x releases.
This class supports writing PDF files out, given pages produced by another class (typically PdfFileReader).
Adds a page to this PDF file. The page is usually acquired from a PdfFileReader instance.
Stability: Added in v1.0, will exist for all v1.x releases.
Encrypt this PDF file with the PDF Standard encryption handler.
Writes the collection of pages added to this object out as a PDF file.
Stability: Added in v1.0, will exist for all v1.x releases.
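The reader and writer classes are usually combined: pages are fetched from a reader, added to a writer, and then written out. The following is a minimal sketch of that flow. copy_pages is a hypothetical helper, and the method names addPage and write (taking a writable binary stream) are assumed spellings of the functions described above.

```python
def copy_pages(reader, writer, stream):
    """Copy every page from `reader` into `writer`, then write to `stream`.

    `stream` is assumed to be a file-like object opened for binary
    writing. Returns the number of pages copied.
    """
    count = reader.getNumPages()
    for index in range(count):
        # Pages are added in their original order (numbering from zero).
        writer.addPage(reader.getPage(index))  # assumed method name
    writer.write(stream)  # assumed method name
    return count
```

In practice the reader would be constructed from an input file opened in binary mode and the stream would be an output file opened the same way. Any page changes such as rotation or merging would be applied between getPage and addPage.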