PDF Parallel Processing with Python API: Code Share #1375
relic-yuexi
started this conversation in
Show and tell
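This is great! I took a look and it is implemented directly with Python multiprocessing (mp). Could you compare it with litserver and describe the advantages and disadvantages of each? Also, are there any errors or warnings in your logging output that you could post for comparison?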
PDF Parallel Processing with Python API: Tool Share
English Description
Overview
I recently wrote a Python script, built on the MinerU library, that processes PDF files in parallel and converts them to Markdown. It is aimed at workloads with a large number of PDF files, and it supports multi-GPU acceleration, which can significantly reduce total processing time.
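For readers who want a feel for the general pattern, here is a minimal sketch of fanning PDFs out to one worker process per GPU with Python's `multiprocessing`. It is not the script from this post: `convert_pdf_to_markdown` is a hypothetical placeholder that only writes a stub `.md` file, and the names `assign_gpus`, `worker`, and `process_all` are made up for illustration. The real MinerU conversion call would go inside the placeholder. Pinning a process to a GPU through `CUDA_VISIBLE_DEVICES` is an assumption about how the GPU backend picks its device.

```python
import os
from multiprocessing import get_context
from pathlib import Path


def assign_gpus(pdf_paths, gpu_ids):
    """Distribute PDFs round-robin; returns a list of (gpu_id, [paths])."""
    buckets = {gpu: [] for gpu in gpu_ids}
    for i, path in enumerate(pdf_paths):
        buckets[gpu_ids[i % len(gpu_ids)]].append(path)
    return list(buckets.items())


def convert_pdf_to_markdown(pdf_path, output_dir):
    """Hypothetical placeholder: swap this body for the real MinerU call."""
    out_path = Path(output_dir) / (Path(pdf_path).stem + ".md")
    out_path.write_text(f"# {Path(pdf_path).name}\n", encoding="utf-8")
    return str(out_path)


def worker(task):
    """Process one GPU's batch; returns (pdf, markdown_path, error) tuples."""
    gpu_id, paths, output_dir = task
    # Assumption: the backend honors CUDA_VISIBLE_DEVICES, and it is set
    # before any GPU library is imported in this process.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    results = []
    for path in paths:
        try:
            results.append((path, convert_pdf_to_markdown(path, output_dir), None))
        except Exception as exc:  # one bad PDF should not stop the batch
            results.append((path, None, repr(exc)))
    return results


def process_all(pdf_paths, output_dir, gpu_ids=(0,)):
    """Run one worker process per GPU that has work assigned."""
    batches = assign_gpus(list(pdf_paths), list(gpu_ids))
    tasks = [(gpu, paths, output_dir) for gpu, paths in batches if paths]
    if not tasks:
        return []
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    # "spawn" gives each worker a fresh interpreter, so no GPU state
    # is inherited from the parent process.
    with get_context("spawn").Pool(processes=len(tasks)) as pool:
        nested = pool.map(worker, tasks)
    return [item for batch in nested for item in batch]
```

To try it, call something like `process_all(sorted(Path("pdfs").glob("*.pdf")), "markdown", gpu_ids=(0, 1))` from under an `if __name__ == "__main__":` guard, which the `spawn` start method requires. Each result tuple carries the error text for a failed file, so failures can be logged without aborting the rest of the run.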
Key Features
How to Use
Example
Future Plans
If you have any questions or suggestions, feel free to leave a comment. I hope this tool helps!