Feature upgrade apache pdfbox #4450
Open
+1,379
−170
First draft for #4449
It switches to PDFBox 3.0.3 and, in case of errors, falls back to PDFBox 1.8.
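A minimal sketch of the fallback idea, for discussion only. It does not use the PDFBox API; `TextExtractor`, `extractWithFallback`, and the `log` callback are hypothetical names that stand in for the two extraction paths and the log message, and are not taken from this PR.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical helper: illustrates "try the new version, fall back to the old one".
final class FallbackExtraction {

    // Stand-in for one text extraction path (e.g. one PDFBox version).
    @FunctionalInterface
    interface TextExtractor {
        String extract(byte[] pdf) throws IOException;
    }

    // Tries the primary extractor first; on any failure, logs and uses the fallback.
    static String extractWithFallback(byte[] pdf,
                                      TextExtractor primary,
                                      TextExtractor fallback,
                                      Consumer<String> log) {
        try {
            return primary.extract(pdf);
        } catch (IOException | RuntimeException e) {
            log.accept("Primary extractor failed, using fallback: " + e);
        }
        try {
            return fallback.extract(pdf);
        } catch (IOException e) {
            // Both extractors failed; surface the fallback's error unchecked.
            throw new UncheckedIOException(e);
        }
    }
}
```

The log callback is a parameter here so that the fallback event can be routed somewhere more visible than a plain log line.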
What I am still thinking about: how to find out whether and where the results of the two PDFBox versions differ.
Right now, I am printing a log message. But users will most likely not notice it, let alone inform us.
I wonder if we should create both versions of the extracted text and, if they differ, put both files into the debug text. That way, we would at least see which documents are affected.
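One possible way to detect such differences, sketched under assumptions: compare the two extracted texts line by line and report the line numbers that differ. `ExtractionDiff` and `differingLines` are hypothetical names, and ignoring leading and trailing whitespace is an assumption about what counts as a relevant difference.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical helper: finds where two extracted texts differ.
final class ExtractionDiff {

    // Returns the 1-based line numbers at which the two texts differ.
    // Lines are trimmed before comparison; a line missing in one text counts as a difference.
    static List<Integer> differingLines(String first, String second) {
        String[] a = first.split("\\R", -1);
        String[] b = second.split("\\R", -1);
        List<Integer> result = new ArrayList<>();
        int max = Math.max(a.length, b.length);
        for (int i = 0; i < max; i++) {
            String left = i < a.length ? a[i].trim() : null;
            String right = i < b.length ? b[i].trim() : null;
            if (!Objects.equals(left, right)) {
                result.add(i + 1);
            }
        }
        return result;
    }
}
```

If the returned list is non-empty, both texts could be written to the debug output together with the line numbers, so affected documents show up without relying on users to report a log message.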