When calling get() with an index outside the page range, RawImageOutput returns the last page's image, whereas TextOutput raises an IndexError.
Steps To Reproduce:-
import pyxpdf.xpdf as x  # assumption: `x` aliases pyxpdf.xpdf, as the traceback suggests
d=x.Document("samples/simple1.pdf")
iout=x.RawImageOutput(d)
tout=x.TextOutput(d)
print(len(d))
print(iout.get(10)) # will return same as iout.get(0)
print(tout.get(10)) # will raise IndexError
Output:-
1
<PIL.Image.Image image mode=RGB size=1275x1651 at 0x7F179F573370>
Traceback (most recent call last):
  File "_test.py", line 15, in <module>
    tout.get(10)
  File "src/pyxpdf/textoutput.pxi", line 268, in pyxpdf.xpdf.TextOutput.get
    cpdef object get(self, int page_no):
  File "src/pyxpdf/textoutput.pxi", line 286, in pyxpdf.xpdf.TextOutput.get
    return self._get_bytes(page_no).decode('UTF-8', errors='ignore')
  File "src/pyxpdf/textoutput.pxi", line 209, in pyxpdf.xpdf.TextOutput._get_bytes
    if self._cache_texts[page_no] == None:
IndexError: list index out of range
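Until both classes agree, a caller-side guard can make the behaviour uniform. The following is only a sketch: `checked_get` is a hypothetical helper, not part of pyxpdf, and it assumes zero-based page indices (as the example above implies) and that `len(d)` gives the page count.

```python
def checked_get(output, page_count, page_no):
    """Call output.get(page_no) only if page_no is a valid zero-based index.

    `output` is any object with a get(page_no) method, such as the
    TextOutput or RawImageOutput instances above. Hypothetical helper,
    not a pyxpdf API.
    """
    if not 0 <= page_no < page_count:
        # Raise the same exception type for every output class.
        raise IndexError(
            f"page index {page_no} out of range for {page_count} page(s)"
        )
    return output.get(page_no)
```

With this in place, `checked_get(iout, len(d), 10)` and `checked_get(tout, len(d), 10)` would both raise IndexError. Negative indices are rejected here as well; that is a choice made for this sketch, not documented library behaviour.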