
Check if a PDF page has only text or other additions too #3757

Closed Answered by JorjMcKie
vignesh0710 asked this question in Looking for help
There are a number of checks available:

  • page.get_text() in its several output variants. The "text" variant offers a quick check for any text at all or only white text.
  • page.first_annot, page.first_link, page.first_widget: if any of them is not None, then there are objects of the respective type.
  • page.get_drawings() != [] means that there are drawings / vector graphics on the page.
  • page.get_image_info() != [] means that there are images on the page.
  • page.find_tables().tables != [] means that there are tables on the page.
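
The checks above can be combined into one helper. This is only a sketch: the names page_content_summary, is_text_only and summarize_pdf are made up for illustration, and treating whitespace-only output as "no text" is my own assumption. It uses only the Page attributes and methods listed above, and assumes a PyMuPDF version recent enough to provide find_tables().

```python
def page_content_summary(page):
    """Return a dict of booleans, one per kind of content on a PyMuPDF page."""
    return {
        # Assumption: whitespace-only extraction counts as "no text".
        "text": bool(page.get_text("text").strip()),
        "annotations": page.first_annot is not None,
        "links": page.first_link is not None,
        "widgets": page.first_widget is not None,
        "drawings": bool(page.get_drawings()),
        "images": bool(page.get_image_info()),
        "tables": bool(page.find_tables().tables),
    }


def is_text_only(page):
    """True if the page has text and none of the other content kinds."""
    summary = page_content_summary(page)
    others = [found for kind, found in summary.items() if kind != "text"]
    return summary["text"] and not any(others)


def summarize_pdf(path):
    """Hypothetical driver: print the summary for every page of a file."""
    import pymupdf  # older releases: import fitz

    with pymupdf.open(path) as doc:
        for page in doc:
            print(page.number, page_content_summary(page))
```

Calling summarize_pdf("some.pdf") prints one summary per page. Note that table borders are usually vector graphics, so a page with a ruled table will typically report drawings as well.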

Text columns on the page cannot be determined directly.

Answer selected by vignesh0710