Overlapping text #3564
Unanswered
sttpvvoyer
asked this question in
Looking for help
Replies: 1 comment 2 replies
-
The table's creator did not bother to keep each cell's content inside its designated cell. So all I can think of is extracting the text word by word.
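A minimal sketch of what word-level extraction could look like, assuming the words arrive as tuples shaped like the ones PyMuPDF's page.get_text("words") returns: (x0, y0, x1, y1, text, block_no, line_no, word_no). The helper name words_in_cell, the cell coordinates, and the sample words are all made up for illustration. It keeps only the words whose bounding box lies entirely inside the cell, so a word that straddles the cell border is dropped.

```python
def words_in_cell(words, cell):
    """Return the text of the words whose bounding box lies fully inside `cell`.

    words: tuples shaped like PyMuPDF's page.get_text("words") output:
           (x0, y0, x1, y1, text, block_no, line_no, word_no)
    cell:  (x0, y0, x1, y1) of the table cell (hypothetical coordinates here)
    """
    cx0, cy0, cx1, cy1 = cell
    inside = [
        w for w in words
        if w[0] >= cx0 and w[1] >= cy0 and w[2] <= cx1 and w[3] <= cy1
    ]
    # Order the surviving words top-to-bottom, then left-to-right.
    inside.sort(key=lambda w: (round(w[1]), w[0]))
    return " ".join(w[4] for w in inside)


# Made-up sample: a Comments line overflowing into the Civic cell.
sample_words = [
    (60.0, 100.0, 84.0, 110.0, "OF", 0, 0, 0),       # entirely left of the cell
    (86.0, 100.0, 104.0, 110.0, "THE", 0, 0, 1),     # straddles the left border
    (110.0, 100.0, 140.0, 110.0, "7125-6", 1, 0, 0), # fully inside the cell
    (160.0, 100.0, 180.0, 110.0, "RES", 2, 0, 0),    # entirely right of the cell
]
civic_cell = (90.0, 98.0, 150.0, 112.0)

result = words_in_cell(sample_words, civic_cell)
print(result)
```

With a real page, words would come from page.get_text("words") and the cell from the same coordinates already passed to fitz.Rect. One limitation: an overflowing word that happens to fall entirely inside the cell is still returned, so a further filter may be needed on top of this.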
-
Hi,
I'm struggling to determine the best approach, and any help is appreciated. I have some overlapping text: the Comments column can, in theory, spill in the background over all the other columns, depending on its variable length.
I don't care about losing data from Comments, but I need to make sure the data in the other columns is valid.
How can I do that using get_textbox?
Here is what I have so far:
civic_data = {
    'MYKEY': mykey,
    'CIVIC': page.get_textbox(fitz.Rect(posCivic.x0, firstElemHeight, posPdrType.x0, nextElemHeight - 1)),
    'COMMENTS': page.get_textbox(fitz.Rect(posCommentsPoc.x0 - 5, firstElemHeight, posCivic.x0 - 2, nextElemHeight - 1)),
    'PDRTYPE': page.get_textbox(fitz.Rect(posPdrType.x0, firstElemHeight, posAm.x0 - 5, nextElemHeight - 1)),
    'AM': page.get_textbox(fitz.Rect(posAm.x0 - 5, firstElemHeight, posNWPV.x0 - 5, nextElemHeight - 1)),
    'LNM_V': page.get_textbox(fitz.Rect(posNWPV.x0 - 5, firstElemHeight, posCc.x0 - 5, nextElemHeight - 1)),
    'CC': page.get_textbox(fitz.Rect(posCc.x0 + 5, firstElemHeight, posMMV.x0 - 5, nextElemHeight - 1)),
}
For example, the row with Civic 7125-6 returns the following, with a line break between "HE ARTS" and "7125-6":
How can I get get_textbox to return only "7125-6"?