Pymupdf, reading in two different pdfs of the same book differently, why? #3851
Unanswered
embland
asked this question in
Looking for help
-
Both PDFs are made from scanned copies of the original book which are OCRed in different ways and potentially with different post-processing. Also note the vastly different file sizes.
When (Py)MuPDF detects a large horizontal distance between pieces of text, it generates a line break, even when the bottom coordinates suggest that the pieces sit at the same vertical position (which is only approximately true for OCR output anyway).
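To make the point about horizontal gaps concrete, here is a minimal sketch of one possible workaround: regroup the extracted words into lines purely by their bottom coordinate, ignoring how far apart they are horizontally. It assumes the word tuples have the layout returned by PyMuPDF's `page.get_text("words")`, that is `(x0, y0, x1, y1, text, block_no, line_no, word_no)`. The helper name `rebuild_lines`, the tolerance `y_tol`, and the sample coordinates are made up for illustration and are not part of PyMuPDF.

```python
from typing import List, Sequence, Tuple

# Assumed word layout, matching page.get_text("words") in PyMuPDF:
# (x0, y0, x1, y1, text, block_no, line_no, word_no)
Word = Tuple[float, float, float, float, str, int, int, int]


def rebuild_lines(words: Sequence[Word], y_tol: float = 3.0) -> List[str]:
    """Group words into lines by bottom coordinate (y1) only.

    Words whose bottom edge is within y_tol points of the first word of
    the current line are put on that line, however large the horizontal
    gap between them is. y_tol is a hypothetical tuning knob.
    """
    lines: List[List[Word]] = []
    # Visit words from top to bottom, then left to right.
    for word in sorted(words, key=lambda w: (w[3], w[0])):
        if lines and abs(word[3] - lines[-1][0][3]) <= y_tol:
            lines[-1].append(word)
        else:
            lines.append([word])
    # Within each line, order the words left to right and join them.
    return [
        " ".join(w[4] for w in sorted(line, key=lambda w: w[0]))
        for line in lines
    ]


# Invented sample: entry names on the left, page numbers far to the
# right, with slightly different bottom coordinates as is typical of OCR.
sample_words: List[Word] = [
    (72.0, 100.0, 110.0, 112.0, "Dwarf", 0, 0, 0),
    (280.0, 100.5, 292.0, 112.6, "18", 1, 0, 0),
    (72.0, 114.0, 90.0, 126.0, "Elf", 2, 0, 0),
    (280.0, 113.8, 292.0, 125.7, "21", 3, 0, 0),
]

print(rebuild_lines(sample_words))  # ['Dwarf 18', 'Elf 21']

# With a real file the call might look like this (not executed here):
#   import pymupdf                      # "import fitz" in older releases
#   page = pymupdf.open("no_dots_contents.pdf")[0]
#   print("\n".join(rebuild_lines(page.get_text("words"))))
```

On a multi-column page such as this table of contents, grouping by baseline alone would also merge entries from neighbouring columns, so it would probably need to be applied one column at a time, for example by passing a `clip` rectangle to `get_text`. The right tolerance depends on the scan and has to be found by experiment.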
-
Hi! I'm reading in the Dungeons & Dragons Player's Handbook 5e for practice, and I found that two different PDFs of it are extracted differently. With one, the table of contents comes out in order (and with the dots from the PDF) but with weird spaces added inside words; with the other, the weird spacing is gone but the entries are out of order (and the dots are missing). I've attached the contents page from each PDF below, along with the output each one produces.
Why does this happen? Is it how the PDF is saved? Is it just the sharpness/higher contrast in the one with dots? I don't know a lot about PDF formatting, and any help in understanding why this happens is really appreciated.
no_dots_contents.pdf
with_dots_contents.pdf
With Dots Contents.txt:
C o n t e n t s
P r e f a c e
4
I n t r o d u c t i o n
5
W orlds o f Adventure................................................................... 5
Using This B o o k ......................................................................... 6
How to Play...................................................................................6
Adventures.................................................................................... 7
P a r t 1
9
C h a p t e r 1: St e p - b y - S t e p C h a r a c t e r s ..... 11
Beyond 1st Level............................................................... 15
C h a p t e r 2: R a c e s ........................................................ 17
Choosing a Race................................................................ 17
D w arf.................................................................................... 18
E lf...........................................................................................21
Halfling.................................................................................26
Human..................................................................................29
D ragonborn........................................................................ 32
G nom e..................................................................................35
Half-Elf.................................................................................38
H alf-O rc.............................................................................. 40
Tiefling.................................................................................42
C h a p t e r 3: C l a s s e s ..................................................45
Barbarian............................................................................ 46
B a rd ...................................................................................... 51
Cleric.....................................................................................56
Druid.....................................................................................64
Fighter..................................................................................70
M onk.....................................................................................76
Paladin.................................................................................82
Ranger..................................................................................89
Rogue....................................................................................94
S orcerer.............................................................................. 99
W arlock..............................................................................105
W izard................................................................................112
C h a p t e r 4 : P e r s o n a l i t y a n d
B a c k g r o u n d .................................................................. 121
Character Details............................................................121
Inspiration........................................................................ 125
Backgrounds....................................................................125
C h a p t e r 5: E q u i p m e n t .........................................143
Starting Equipment....................................................... 143
W ealth................................................................................143
Arm or and Shields.........................................................144
W eapons............................................................................ 146
Adventuring G ear...........................................................148
Tools....................................................................................154
Mounts and Vehicles..................................................... 155
Trade G oods.....................................................................157
Expenses........................................................................... 157
Trinkets............................................................................. 159
C h a p t e r 6 : C u s t o m i z a t i o n O p t i o n s .... 163
Multiclassing....................................................................163
F eats...................................................................................165
P a r t 2
171
C h a p t e r 7: U s in g A b i l i t y S c o r e s ...........173
Ability S cores and M odifiers........................................173
Advantage and Disadvantage...................................... 173
Proficiency B onus............................................................173
Ability Checks...................................................................174
Using Each Ability...........................................................175
Saving T h row s................................................................179
C h a p t e r 8: A d v e n t u r i n g ......................181
T im e...........................................................................181
M ovem ent.......................................................................... 181
The Environment.......................... .........................183
Social Interaction........................................................... 185
R esting............................................................................... 186
Between Adventures......................................................186
C h a p t e r 9 : C o m b a t ................................................ 189
The Order of Com bat.....................................................189
Movement and Position.................................................190
Actions in C om bat.......................................................... 192
Making an Attack............................................................ 193
Cover................................................................................... 196
Damage and H ealing.....................................................196
Mounted Combat............................................................. 198
Underwater Com bat.......................................................198
P a r t 3
199
C h a p t e r 10: S p e l l c a s t i n g .................................201
What Is a S p ell?...............................................................201
Casting a S p ell................................................................ 202
C h a p t e r 11: S p e l l s .....................................................207
Spell Lists..........................................................................207
Spell D escriptions...........................................................211
A p p e n d i x A : C o n d i t i o n s
290
A p p e n d i x B:
G o d s o f t h e M u l t i v e r s e
293
A p p e n d i x C :
T h e P l a n e s o f E x i s t e n c e
300
The Material Plane........................ ...............................300
Beyond the M aterial..................................301
A p p e n d i x D:
C r e a t u r e S t a t i s t i c s
304
A p p e n d i x E:
I n s p i r a t i o n a l R e a d i n g
312
I n d e x
313
C h a r a c t e r S h e e t
317
No Dots Content.txt:
CONTENTS
PREFACE
4
PART2
171
Worlds of Adventure
5
Using This Book
6
How to Play
6
Adventures
7
CHAPTER
1: STEP-By-STEP
CHARACTERS
ll
Beyond 1st Level
15
CHAPTER
2: RACES
17
Choosing a Race
17
Owarf
18
Elf
21
Hal fiing
26
Human
29
Oragonborn
32
Gnome
35
Half.Elf
38
Half-Orc
40
Tiefling
.42
CHAPTER
3: CLASSES
45
Barbaria n
46
Bard
51
Cleric
56
Oruid
64
Fighter
70
Monk
76
Paladin
82
Ranger
89
Rogue
94
Sorcerer
99
Warlock
105
Wizard
112
CHAPTER
4: PERSONALITY
AND
BACKG ROUN D
121
Character
Oetails
121
Inspiration
125
Backgrounds
125
CHAPTER
5: EQUIPMENT
143
Starting
Equipment...
143
Wealth
143
Armor and Shields
144
Weapons
146
Adventuring Gear
148
Tools
154
Mounts and Vehicles
155
Trade Goods
157
Expenses
157
Trinkets
159
CHAPTER
6: CUSTOMIZATION
OPTIONS
163
MuIticlassi ng
163
Feats
165
ApPENDIX
A: CONDITIONS
290
The Material Planc
300
Beyond the Material
301
199
317
312
313
304
CHAPTER
10: SPELLCASTING
201
What Is a Spell?
201
Casting a Spell
202
CHAPTER
11: SPELLS
:
207
Spe11Lists
207
Spel1 Oescriptions
211
PART3
CHAPTER
7: USING ABILITY
SCORES
173
Ability Scores and Modifiers
173
Advantage and Oisadvantage
173
Proficiency Bonus
173
Ability Checks
174
Using Each Ability
175
Saving Throws
179
CHAPTER
8: ADVENTURING
181
Time
181
Movement
181
The Environment
183
Social Interaction
185
Resting
186
Between Adventures
186
CHAPTER
9: COMBAT
189
The Order of Combat
189
Movement and Position
190
Actions in Combat
192
Making an Attack
193
Cover
196
Oamage and Healing
196
Mounted Combat...
198
Underwater
Combat
198
CHARACTER
SHEET
ApPENDIX
D:
CREATURE
STATISTICS
INDEX
ApPENDIX
C:
THE PLANES OF EXISTENCE
300
ApPENDIX
B:
GODS OF THE MULTIVERSE
293
ApPENDIX
E:
INSPIRATIONAL
READING
5
9
INTRODUCTION
PART 1