The extracted text gets reversed after interacting with a rect #3590
-
I tried two ways to extract a single piece of text from a PDF page that contains tables. 1) Removing the bounding boxes of the tables with apply_redactions() and then calling get_text(): the table text is gone, but the remaining text comes back in reversed order. 2) Calling get_text() without apply_redactions(): the text is returned in the right order, but it includes the unwanted table text. Do I have to reorder the text myself with splitlines() and reverse()?
Replies: 2 comments 1 reply
-
I have always used page.get_text(clip=rect, sort=True) and it has worked as intended!
-
Why don't you simply extract via page.get_text(sort=True)?
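For anyone landing here with the same problem, here is a minimal sketch of one way to combine the suggestions in this thread without touching redactions at all: ask get_text() for sorted blocks and skip the ones that overlap the table rectangles. This is not taken from the replies above; the helper names text_outside_rects and demo are hypothetical, and the table rectangles are assumed to be plain (x0, y0, x1, y1) tuples that you already have.

```python
def text_outside_rects(page, skip_rects):
    """Return page text in sorted order, skipping blocks that overlap skip_rects.

    `page` is assumed to be a PyMuPDF page. `skip_rects` is a list of
    (x0, y0, x1, y1) tuples, e.g. table bounding boxes (hypothetical input).
    """

    def overlaps(a, b):
        # True when two (x0, y0, x1, y1) rectangles share some area.
        return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

    kept = []
    # "blocks" yields (x0, y0, x1, y1, text, block_no, block_type) tuples.
    # sort=True asks PyMuPDF to order them by vertical, then horizontal position.
    for x0, y0, x1, y1, text, _no, block_type in page.get_text("blocks", sort=True):
        if block_type != 0:
            # Skip image blocks (block_type 1), if any are present.
            continue
        if any(overlaps((x0, y0, x1, y1), r) for r in skip_rects):
            # This block touches a table area, so leave it out.
            continue
        kept.append(text.strip())
    return "\n".join(kept)


def demo(pdf_path):
    # Assumption: a recent PyMuPDF that provides `import pymupdf` and
    # Page.find_tables(); older releases are imported as `fitz`.
    import pymupdf

    doc = pymupdf.open(pdf_path)
    page = doc[0]
    tables = [tuple(t.bbox) for t in page.find_tables().tables]
    return text_outside_rects(page, tables)
```

Because the page itself is never modified, the order returned by get_text() should not be affected the way it was after apply_redactions(). If you only need the text inside one known rectangle, page.get_text(clip=rect, sort=True) from the reply above is the shorter route.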