Segmentation Fault when using Document.text() between RawImageOutput.get() calls #9
GDB stack trace:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff75817f2 in XRef::fetch (this=this@entry=0x20, num=7, gen=0, obj=obj@entry=0x7fffffffad30, recursion=recursion@entry=0)
at /home/oreki/Projects/pyxpdf/build/tmp/libxpdf/xpdf-4.02/xpdf/XRef.cc:1039
1039 if (num < 0 || num >= size) {
Backtrace:
#0 0x00007ffff75817f2 in XRef::fetch (this=this@entry=0x20, num=7, gen=0, obj=obj@entry=0x7fffffffad30, recursion=recursion@entry=0)
at /home/oreki/Projects/pyxpdf/build/tmp/libxpdf/xpdf-4.02/xpdf/XRef.cc:1039
#1 0x00007ffff75655bc in Object::fetch (this=this@entry=0x7fffffffad20, xref=xref@entry=0x20, obj=obj@entry=0x7fffffffad30, recursion=recursion@entry=0)
at /home/oreki/Projects/pyxpdf/build/tmp/libxpdf/xpdf-4.02/xpdf/Object.cc:114
#2 0x00007ffff754a7c8 in GfxFontDict::GfxFontDict (this=0x555555b18340, xref=0x20, fontDictRef=0x0, fontDict=0x555555b16f80)
at /home/oreki/Projects/pyxpdf/build/tmp/libxpdf/xpdf-4.02/xpdf/GfxFont.cc:2115
#3 0x00007ffff7536253 in GfxResources::GfxResources (this=0x555555b1c630, xref=0x20, resDict=0x555555b16d80, nextA=0x0)
at /home/oreki/Projects/pyxpdf/build/tmp/libxpdf/xpdf-4.02/xpdf/Object.h:161
#4 0x00007ffff7536a3b in Gfx::Gfx (this=0x555555a60230, docA=<optimized out>, outA=0x555555a9cc70, pageNum=1, resDict=0x555555b16d80, hDPI=150, vDPI=150,
box=0x7fffffffaea0, cropBox=0x0, rotate=0, abortCheckCbkA=0x0, abortCheckCbkDataA=0x0) at /home/oreki/Projects/pyxpdf/build/tmp/libxpdf/xpdf-4.02/xpdf/Gfx.cc:509
#5 0x00007ffff756869a in Page::displaySlice (this=0x555555b17290, out=0x555555a9cc70, hDPI=150, vDPI=150, rotate=0, useMediaBox=<optimized out>, crop=<optimized out>,
sliceX=<optimized out>, sliceY=0, sliceW=1275, sliceH=1651, printing=0, abortCheckCbk=0x0, abortCheckCbkData=0x0)
at /home/oreki/Projects/pyxpdf/build/tmp/libxpdf/xpdf-4.02/xpdf/Object.h:161
#6 0x00007ffff74b8de0 in __pyx_f_6pyxpdf_4xpdf_4Page_display_slice (__pyx_v_self=0x7ffff7757050, __pyx_v_out=0x555555a9cc70, __pyx_v_x1=0, __pyx_v_y1=0, __pyx_v_hgt=1275,
__pyx_v_wdt=1651, __pyx_optional_args=0x7fffffffb100) at src/pyxpdf/xpdf.cpp:28114
#7 0x00007ffff749429d in __pyx_f_6pyxpdf_4xpdf_14RawImageOutput__get_SplashBitmap (__pyx_v_self=0x7ffff6fd0ad0, __pyx_v_page_no=0, __pyx_v_x=0, __pyx_v_y=0,
__pyx_v_w=1275, __pyx_v_h=1651, __pyx_v_page_h=1650.0000000000002, __pyx_v_page_w=1275, __pyx_v_res_x=150, __pyx_v_res_y=150) at src/pyxpdf/xpdf.cpp:19167
#8 0x00007ffff7496671 in __pyx_f_6pyxpdf_4xpdf_14RawImageOutput__get_normalize_SplashBitmap (__pyx_v_self=0x7ffff6fd0ad0, __pyx_v_page_no=0, __pyx_v_crop_x=0,
__pyx_v_crop_y=0, __pyx_v_crop_h=0, __pyx_v_crop_w=0, __pyx_v_scale_x=0, __pyx_v_scale_y=0) at src/pyxpdf/xpdf.cpp:19650
#9 0x00007ffff7498258 in __pyx_f_6pyxpdf_4xpdf_14RawImageOutput_get (__pyx_v_self=0x7ffff6fd0ad0, __pyx_v_page_no=0, __pyx_skip_dispatch=1,
__pyx_optional_args=0x7fffffffb590) at src/pyxpdf/xpdf.cpp:19954
#10 0x00007ffff7498ed1 in __pyx_pf_6pyxpdf_4xpdf_14RawImageOutput_2get (__pyx_v_self=0x7ffff6fd0ad0, __pyx_v_page_no=0, __pyx_v_crop_box=(0, 0, 0, 0),
__pyx_v_scale_pixel_box=None) at src/pyxpdf/xpdf.cpp:20125
#11 0x00007ffff7498cd9 in __pyx_pw_6pyxpdf_4xpdf_14RawImageOutput_3get (__pyx_v_self=<pyxpdf.xpdf.RawImageOutput at remote 0x7ffff6fd0ad0>, __pyx_args=(0,), __pyx_kwds=0x0)
at src/pyxpdf/xpdf.cpp:20102
#12 0x00007ffff750c00c in __Pyx_CyFunction_CallMethod (func=<cython_function_or_method at remote 0x7ffff6cee2f0>,
self=<pyxpdf.xpdf.RawImageOutput at remote 0x7ffff6fd0ad0>, arg=(0,), kw=0x0) at src/pyxpdf/xpdf.cpp:47923
#13 0x00007ffff750c3d0 in __Pyx_CyFunction_CallAsMethod (func=<cython_function_or_method at remote 0x7ffff6cee2f0>,
args=(<pyxpdf.xpdf.RawImageOutput at remote 0x7ffff6fd0ad0>, 0), kw=0x0) at src/pyxpdf/xpdf.cpp:47986
#14 0x00005555555d9f77 in _PyObject_MakeTpCall (callable=<cython_function_or_method at remote 0x7ffff6cee2f0>, args=<optimized out>, nargs=<optimized out>, keywords=0x0)
at Objects/call.c:159
#15 0x00005555557b9a91 in _PyObject_Vectorcall (kwnames=0x0, nargsf=2, args=0x5555559dd630, callable=<cython_function_or_method at remote 0x7ffff6cee2f0>)
at ./Include/cpython/abstract.h:125
#16 method_vectorcall (method=<optimized out>, args=0x5555559dd638, nargsf=<optimized out>, kwnames=0x0) at Objects/classobject.c:60
#17 0x00005555555c6044 in _PyObject_Vectorcall (kwnames=0x0, nargsf=9223372036854775809, args=0x5555559dd638, callable=<method at remote 0x7ffff78564d0>)
at ./Include/cpython/abstract.h:127
#18 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x555555975250) at Python/ceval.c:4963
#19 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3469
#20 0x00005555556b0581 in PyEval_EvalFrameEx (throwflag=0, f=Frame 0x5555559dd4c0, for file _test.py, line 275, in <module> ()) at Python/ceval.c:741
#21 _PyEval_EvalCodeWithName (_co=_co@entry=<code at remote 0x7ffff77b9790>,
globals=globals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='_test.py') at remote 0x7ffff7
The problem is in `PDFDoc::displayPages()`:
void PDFDoc::displayPages(OutputDev *out, int firstPage, int lastPage,
                          double hDPI, double vDPI, int rotate,
                          GBool useMediaBox, GBool crop, GBool printing,
                          GBool (*abortCheckCbk)(void *data),
                          void *abortCheckCbkData) {
  int page;
  for (page = firstPage; page <= lastPage; ++page) {
    displayPage(out, page, hDPI, vDPI, rotate, useMediaBox, crop, printing,
                abortCheckCbk, abortCheckCbkData);
    catalog->doneWithPage(page);  // <-- this unloads the page
  }
}
Observation: The `text()` and `text_bytes()` methods use the `Document.display_pages()` wrapper method, which calls xpdf's `PDFDoc::displayPages()` C++ method. Inside its loop, that method calls `Catalog::doneWithPage()` after each `displayPage()` call, which unloads xpdf's internal `Page` object. Our wrapper extension class `Page`, however, keeps pointing to the old xpdf `Page`. Any later operation that goes through that stale pointer, such as `displayPageSlice()`, causes a SEGFAULT.
Fix: Changed `Document.display_pages()` to do the same work as `displayPages()`, but without unloading the pages.
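The lifetime problem described in the observation can be modelled in plain Python. This is only an illustrative sketch, not pyxpdf or xpdf code: `NativePage`, `Catalog`, `PageWrapper`, `display_pages`, and `demo` are hypothetical stand-ins, and an `alive` flag takes the place of freed C++ memory (where Python raises an exception, the real code would hit undefined behaviour).

```python
class NativePage:
    """Hypothetical stand-in for xpdf's C++ Page object."""

    def __init__(self, num):
        self.num = num
        self.alive = True  # models "memory is still valid"

    def display_slice(self):
        if not self.alive:
            # In C++ this would be a use-after-free, not a clean exception.
            raise RuntimeError(f"use of unloaded page {self.num}")
        return f"rendered page {self.num}"


class Catalog:
    """Hypothetical stand-in for a catalog that loads pages on demand."""

    def __init__(self):
        self._pages = {}

    def get_page(self, num):
        if num not in self._pages:
            self._pages[num] = NativePage(num)
        return self._pages[num]

    def done_with_page(self, num):
        page = self._pages.pop(num, None)
        if page is not None:
            page.alive = False  # models deleting the native object


class PageWrapper:
    """Hypothetical stand-in for the extension class holding a raw pointer."""

    def __init__(self, catalog, num):
        self._native = catalog.get_page(num)  # cached once, never refreshed

    def render(self):
        return self._native.display_slice()


def display_pages(catalog, first, last, unload):
    """Render each page; optionally unload it afterwards, as in the loop above."""
    for num in range(first, last + 1):
        catalog.get_page(num).display_slice()
        if unload:
            catalog.done_with_page(num)


def demo(unload):
    catalog = Catalog()
    wrapper = PageWrapper(catalog, 1)
    wrapper.render()                        # first render works
    display_pages(catalog, 1, 1, unload)    # text extraction in between
    try:
        wrapper.render()                    # second render
        return "ok"
    except RuntimeError:
        return "stale"
```

In this model, `demo(unload=True)` ends with the wrapper touching an unloaded page, while `demo(unload=False)` corresponds to the fix: the pages stay loaded, so the cached reference remains valid.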
Steps to Reproduce:-
_test.py
Output :-
System:-