Backticks produce extra line breaks #755

Open
klvbdmh opened this issue Nov 30, 2024 · 1 comment
Labels
bug Something isn't working

Comments


klvbdmh commented Nov 30, 2024

When parsing text that contains inline code in backticks ( ` ), the space after each closing backtick becomes a line break.

Steps to reproduce

Using a page from the Polars documentation:

wget -O docs.pola.rs.unpivot.html https://docs.pola.rs/user-guide/transformations/unpivot/
cat docs.pola.rs.unpivot.html | trafilatura --formatting

Expected

## Eager + lazy

`Eager` and `lazy` have the same API.

Actual

## Eager + lazy

`Eager`

and `lazy`

have the same API.
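
To make the expected behavior concrete, here is a minimal standard-library sketch of how an inline `<code>` element could be rendered without ending the paragraph. This is not trafilatura's implementation: the `InlineCodeRenderer` class, the `render` helper, and the sample HTML are hypothetical, and the real markup on the Polars page may differ.

```python
from html.parser import HTMLParser


class InlineCodeRenderer(HTMLParser):
    """Hypothetical renderer: <p> becomes one paragraph, <code> a backtick span."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []  # finished paragraphs
        self._parts = []      # text pieces of the paragraph being built

    def handle_starttag(self, tag, attrs):
        if tag == "code":
            # Inline element: open a backtick span, stay in the same paragraph.
            self._parts.append("`")

    def handle_endtag(self, tag):
        if tag == "code":
            # Closing an inline element must not end the paragraph.
            self._parts.append("`")
        elif tag == "p":
            # Only the block-level close flushes the paragraph.
            text = " ".join("".join(self._parts).split())
            if text:
                self.paragraphs.append(text)
            self._parts = []

    def handle_data(self, data):
        self._parts.append(data)


def render(html: str) -> str:
    parser = InlineCodeRenderer()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.paragraphs)


# Assumed sample markup, modeled on the sentence from the report.
sample = "<p><code>Eager</code> and <code>lazy</code> have the same API.</p>"
print(render(sample))
```

The sketch treats the closing `</code>` tag as inline, so only `</p>` starts a new paragraph. The output in "Actual" looks as if each code span were being handled like a block element.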


I also noticed that the tables are malformed, but that looks like it's covered by #553.

adbar added the bug label Dec 2, 2024
adbar (Owner) commented Dec 2, 2024

That's correct, this is a bug indeed.
