Changeset 728

Timestamp:
02/02/06 11:28:22 (3 years ago)
Author:
mfenniak
Message:

off-by-one workaround for endstream identifier

Files:

  • pypdf/trunk/pyPdf/generic.py

    --- pypdf/trunk/pyPdf/generic.py (r727)
    +++ pypdf/trunk/pyPdf/generic.py (r728)
    @@ -351,13 +351,22 @@
                     stream.seek(t, 0)
                 data["__streamdata__"] = stream.read(length)
    -            # (sigh) - the odd PDF file has a length that is too long, so we'd
    -            # need to read backwards to find the "endstream" ending.  Really,
    -            # who cares - I am sure this code works properly given a correct
    -            # PDF file, so I'm removing this assertion.  It's not necessary to
    -            # read to the end of the object because streams are always in
    -            # indirect objects - there's never an object after this one.
    -            #e = readNonWhitespace(stream)
    -            #ndstream = stream.read(8)
    -            #assert e == "e" and ndstream == "ndstream"
    +            e = readNonWhitespace(stream)
    +            ndstream = stream.read(8)
    +            if (e + ndstream) != "endstream":
    +                # (sigh) - the odd PDF file has a length that is too long, so
    +                # we need to read backwards to find the "endstream" ending.
    +                # ReportLab (unknown version) generates files with this bug,
    +                # and Python users into PDF files tend to be our audience.
    +                # we need to do this to correct the streamdata and chop off
    +                # an extra character.
    +                pos = stream.tell()
    +                stream.seek(-10, 1)
    +                end = stream.read(9)
    +                if end == "endstream":
    +                    # we found it by looking back one character further.
    +                    data["__streamdata__"] = data["__streamdata__"][:-1]
    +                else:
    +                    stream.seek(pos, 0)
    +                    raise "Unable to find 'endstream' marker after stream."
             else:
                 stream.seek(pos, 0)
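
The workaround in this changeset can be tried outside pyPdf with a small standalone sketch. The code below is not the pyPdf code: it is a Python 3 approximation that works on bytes, and the names `read_non_whitespace`, `read_stream_data`, and `WHITESPACE` are hypothetical stand-ins chosen for illustration. It raises `ValueError` where the changeset raises a string, since string exceptions no longer work from Python 2.6 onward, and it uses an absolute seek clamped at zero instead of `stream.seek(-10, 1)` so that very short inputs behave predictably.

```python
import io

# Assumed whitespace set for this sketch; pyPdf's own helper may differ.
WHITESPACE = b" \t\r\n\x0c\x00"


def read_non_whitespace(stream):
    """Skip whitespace and return the next byte (b"" at end of input)."""
    ch = stream.read(1)
    while ch and ch in WHITESPACE:
        ch = stream.read(1)
    return ch


def read_stream_data(stream, length):
    """Read `length` bytes of stream data, tolerating a length that is
    one byte too long (hypothetical helper, not part of pyPdf)."""
    data = stream.read(length)
    e = read_non_whitespace(stream)
    ndstream = stream.read(8)
    if e + ndstream != b"endstream":
        # The declared length may have swallowed the leading "e" of
        # "endstream"; look one byte further back, as the changeset does.
        pos = stream.tell()
        stream.seek(max(pos - 10, 0), 0)
        if stream.read(9) == b"endstream":
            # Drop the extra byte that was read as stream data.
            data = data[:-1]
        else:
            stream.seek(pos, 0)
            raise ValueError("Unable to find 'endstream' marker after stream.")
    return data


# Correct length: the marker follows the data after a newline.
print(read_stream_data(io.BytesIO(b"hello\nendstream"), 5))
# Length one too long: the extra byte is chopped off again.
print(read_stream_data(io.BytesIO(b"helloendstream\n"), 6))
```

Both calls return the same five payload bytes. Input that has no `endstream` marker within reach raises `ValueError` with the stream position restored to where the first check ended.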