pdf citation #3536
Unanswered
Surajlambor
asked this question in
Q&A
Replies: 2 comments 6 replies
-
This data can be obtained like so:
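A minimal sketch of one way to do this with PyMuPDF, since the original snippet is not shown here. The helper names `extract_pages` and `format_citation` are hypothetical, and the code assumes a recent PyMuPDF release that can be imported as `pymupdf` (older releases use `import fitz`).

```python
import os


def extract_pages(pdf_path):
    """Return one record per page: file name, 1-based page number, page text."""
    # Imported here so the formatting helper below works without PyMuPDF installed.
    import pymupdf  # older PyMuPDF releases expose the same module as `fitz`

    records = []
    with pymupdf.open(pdf_path) as doc:
        for page in doc:
            records.append(
                {
                    "source": os.path.basename(pdf_path),
                    "page": page.number + 1,  # page.number is 0-based
                    "text": page.get_text(),
                }
            )
    return records


def format_citation(record):
    """Build a short citation string from a page record (hypothetical format)."""
    return f"{record['source']}, p. {record['page']}"
```

Storing `source` and `page` next to each page's text means that whatever retrieval step you use can return the citation together with the retrieved passage.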
-
How about using the pymupdf4llm package instead? It produces Markdown text and can read multi-column pages and tables.

    import pymupdf4llm
    data = pymupdf4llm.to_markdown("input.pdf", page_chunks=True)

Here, |
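A rough sketch of how the page chunks could be turned into citations. It assumes each chunk is a dict with a `"text"` entry and a `"metadata"` dict containing `"file_path"` and a 1-based `"page"`, which is how I understand the `page_chunks=True` output. The helper names `load_chunks` and `cite_chunks` are hypothetical, and the substring match only stands in for a real retrieval step.

```python
import os


def load_chunks(pdf_path):
    """Return one chunk dict per page via pymupdf4llm."""
    # Imported here so cite_chunks below works without pymupdf4llm installed.
    import pymupdf4llm

    return pymupdf4llm.to_markdown(pdf_path, page_chunks=True)


def cite_chunks(chunks, snippet):
    """Return (file name, page number) for every chunk whose text contains snippet."""
    hits = []
    for chunk in chunks:
        if snippet.lower() in chunk["text"].lower():
            meta = chunk["metadata"]  # assumed keys: "file_path", "page"
            hits.append((os.path.basename(meta["file_path"]), meta["page"]))
    return hits
```

With chunks loaded from `load_chunks("input.pdf")`, `cite_chunks(chunks, "some retrieved phrase")` would list the pages to cite for that phrase.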
-
I love using this library for working with PDFs. I'm now building a PDF summarization tool, and for citations we need the PDF file name and the page number of each retrieved answer. Do you have any ideas on how I can accomplish this? Any help would be appreciated.