Remove image if page.get_image_info() returns multiple images at its location #3631
Unanswered
paulgekeler asked this question in Looking for help
-
Without the PDF itself, there is no way to provide definitive advice, but you have a couple of options here:
-
Hi, thank you for maintaining this project. It really is exceptional.
I am trying to replace images in PDFs with different ones I have stored locally. Thanks to this GitHub discussion https://github.com/pymupdf/PyMuPDF/discussions/924#discussioncomment-7249686 I have no problem doing this as follows:
However, I have encountered a PDF page (see below) where get_image_info() returns multiple images for a single image on the page:
I would like to replace only one of them, i.e. the one actually visible on the page. I am not familiar with how PDFs are assembled or whether images can be composed of multiple parts. Maybe that's what I'm missing.
I have tried to synchronise the source by calling page.clean_contents() beforehand, but that doesn't help.
Is there a way to recognise whether images returned by get_image_info() actually belong to the same image? (I could do something tedious like checking whether bounding boxes are close enough, but that seems error-prone.) I know the returned images have different xrefs, so maybe they are different.
Thank you for some needed insight.
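For reference, the bounding-box check mentioned above could be sketched roughly as follows. This is only an illustration, not a PyMuPDF feature. It assumes each entry is a dict with "bbox" and "xref" keys, as returned by page.get_image_info(xrefs=True). The helper names boxes_overlap and group_overlapping_images and the tol parameter are hypothetical. The code works on plain dicts, so it does not need PyMuPDF to run.

```python
from typing import Dict, List, Sequence


def boxes_overlap(a: Sequence[float], b: Sequence[float], tol: float = 1.0) -> bool:
    """Return True if rectangles a and b, given as (x0, y0, x1, y1),
    intersect or lie within tol points of each other."""
    return (
        a[0] <= b[2] + tol
        and b[0] <= a[2] + tol
        and a[1] <= b[3] + tol
        and b[1] <= a[3] + tol
    )


def group_overlapping_images(infos: List[Dict], tol: float = 1.0) -> List[List[Dict]]:
    """Group image-info dicts whose bounding boxes touch or overlap.

    Each dict is assumed to carry a "bbox" entry (x0, y0, x1, y1).
    Groups are merged transitively: if A touches B and B touches C,
    all three end up in the same group.
    """
    groups: List[List[Dict]] = []
    for info in infos:
        merged = [info]
        remaining = []
        for group in groups:
            if any(boxes_overlap(info["bbox"], m["bbox"], tol) for m in group):
                merged.extend(group)  # absorb every group this image touches
            else:
                remaining.append(group)
        remaining.append(merged)
        groups = remaining
    return groups
```

With PyMuPDF the input would come from something like infos = page.get_image_info(xrefs=True). Any group with more than one entry is then a candidate for "several image objects drawn at one location". Whether such a group really forms a single visible picture still depends on how the PDF was produced, so the tolerance is a guess to tune per document.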