From John L. Scarfone <j0@cox.net>:
While using pyPdf 1.9 I noticed a problem rendering a PDF in Acroread 7.
The problem doesn't show up in xpdf, but Acroread is less forgiving of
errors. I traced it to at least one place where pyPdf assumes that a CR
is always followed by an LF. Below is a diff that works for me. I haven't
investigated whether the same assumption is made anywhere else in the
code. The PDF I have originally came from an HP scanner and uses bare CRs
as line terminators. Just reading it in and writing it back out causes
the first character of the streams to get chopped.
Patch:
--- pyPdf-1.9/pyPdf/generic.py Fri Dec 15 16:58:08 2006
+++ /home/johns/downloads/pyPdf-1.9/pyPdf/generic.py Wed Feb 7 20:34:22 2007
@@ -358,7 +358,9 @@
 assert eol in ("\n", "\r")
 if eol == "\r":
     # read \n after
-    stream.read(1)
+    peek = stream.read(1)
+    if peek != "\n":
+        stream.seek(-1, 1)
 # this is a stream object, not a dictionary
 assert data.has_key("/Length")
 length = data["/Length"]
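
For illustration, here is a minimal standalone sketch of the same peek-and-rewind idea, written for Python 3 against `io.BytesIO` rather than pyPdf itself. The names `skip_stream_eol` and `read_after_eol` are hypothetical and do not exist in pyPdf. Unlike the patch, the sketch also skips the rewind when the read hits end of file, which is an added assumption about the desired behaviour.

```python
import io


def skip_stream_eol(stream):
    """Consume the end-of-line marker expected at the current position.

    Accepts LF, CR LF, or a bare CR. Hypothetical helper, not pyPdf API.
    """
    eol = stream.read(1)
    if eol not in (b"\n", b"\r"):
        raise ValueError("expected CR or LF")
    if eol == b"\r":
        # A CR may or may not be followed by an LF, so look at the next byte.
        peek = stream.read(1)
        # If it is not an LF (and we are not at end of file), it belongs to
        # the data that follows: step back one byte so it is not lost.
        if peek and peek != b"\n":
            stream.seek(-1, io.SEEK_CUR)


def read_after_eol(raw):
    """Return everything that follows the leading end-of-line marker."""
    stream = io.BytesIO(raw)
    skip_stream_eol(stream)
    return stream.read()


if __name__ == "__main__":
    # All three line-ending styles leave the payload intact.
    for raw in (b"\r\nDATA", b"\rDATA", b"\nDATA"):
        print(read_after_eol(raw))
```

With the unconditional `stream.read(1)` from the unpatched code, the bare-CR input would lose its first byte and yield `b"ATA"`, which matches the chopped first character described above.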