Make iter_lines 80x faster #2300
maxmouchet asked this question in Ideas (unanswered; 1 comment, 2 replies)
-
Thank you for this idea! It is likely that a performance enhancement could be made along the lines you suggest. However, I do not think that we want to introduce any behavior changes to that method, in particular the additional empty line that appears when `\r\n` is split across two chunks. I think it should be possible to maintain the current method behavior.
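To make the concern about the split `\r\n` concrete, here is a minimal sketch of one way a `str.splitlines`-based iterator could avoid the extra empty line. It is an illustration only, not the code proposed in this thread and not HTTPX's implementation: the function name `iter_lines_keepends` is hypothetical, and the sketch assumes the input is already decoded into `str` chunks. It still differs from the current behavior in other ways (it splits on every boundary `str.splitlines` recognizes and does not normalize `\r\n` to `\n`).

```python
from typing import Iterable, Iterator


def iter_lines_keepends(chunks: Iterable[str]) -> Iterator[str]:
    """Yield lines with their endings, buffering across chunk boundaries.

    Hypothetical helper for illustration; not part of HTTPX.
    """
    pending = ""
    for chunk in chunks:
        text = pending + chunk
        pending = ""
        lines = text.splitlines(keepends=True)
        # Hold back the last piece unless it ends with "\n". This covers an
        # unterminated line, and also a trailing "\r" that may turn out to be
        # the first half of a "\r\n" pair completed by the next chunk.
        if lines and not lines[-1].endswith("\n"):
            pending = lines.pop()
        yield from lines
    # Whatever is left at the end of the stream is the final line.
    if pending:
        yield pending


# "\r\n" split between two chunks no longer produces an extra empty line.
print(list(iter_lines_keepends(["a\r", "\n"])))
print(list(iter_lines_keepends(["a\n\nb\n"])))
```

The trade-off is that a line ending in a bare `\r` is only emitted once the next chunk (or the end of the stream) shows that no `\n` follows.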
-
Hi,

HTTPX's `iter_lines()` method is currently pretty slow in comparison with requests. Here is a proof of concept inspired by requests's code (models.py#L853-L885): master...maxmouchet:httpx:faster-iter-lines. As shown below, it is about 80x faster than the current implementation (and much simpler!), although it slightly changes the output.
What are your thoughts on this?
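For readers who have not looked at the requests code, the chunk-buffering idea can be sketched roughly as follows. This is a simplified illustration modeled on the general shape of `iter_lines` in requests, not the exact code in the linked branch; the name `iter_lines_sketch` is hypothetical, and the sketch assumes the chunks are already decoded `str` values.

```python
from typing import Iterable, Iterator, Optional


def iter_lines_sketch(chunks: Iterable[str]) -> Iterator[str]:
    """Yield lines (without line endings) from an iterable of text chunks.

    Hypothetical helper for illustration; not part of HTTPX or requests.
    """
    pending: Optional[str] = None
    for chunk in chunks:
        if pending is not None:
            chunk = pending + chunk
        lines = chunk.splitlines()
        # If the chunk ends with the same character as the last split piece,
        # the chunk did not end on a line boundary, so that piece is an
        # incomplete line and is carried over to the next chunk.
        if lines and lines[-1] and chunk and lines[-1][-1] == chunk[-1]:
            pending = lines.pop()
        else:
            pending = None
        yield from lines
    if pending is not None:
        yield pending


print(list(iter_lines_sketch(["a\n\nb\n"])))  # ['a', '', 'b']
print(list(iter_lines_sketch(["a\r", "\n"])))  # ['a', '']
```

The second call reproduces the case where `\r\n` is split across two chunks: the `\r` already terminates `a`, so the lone `\n` in the next chunk is reported as a separate empty line.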
Breaking changes
- Lines are returned without their line endings: `["a", "", "b"]` instead of `["a\n", "\n", "b\n"]`.
- If `\r\n` is split on two chunks between `\r` and `\n`, it will output an additional new line: `["a", ""]` instead of `["a\n"]`.
- Lines are split on more boundaries than just `\n`, `\r` and `\r\n`. See https://docs.python.org/3/library/stdtypes.html#str.splitlines.

Benchmark
Time to iterate over 182634 lines of 9185 characters on average
My specific input file can be downloaded here, but any other file will do.
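A comparable measurement can be set up on synthetic data with a small harness like the one below. It is only a sketch: the helper names (`make_chunks`, `join_then_split`, `time_line_iterator`) are hypothetical, the sizes are arbitrary and much smaller than the file described above, and it is not the script that produced the numbers in this post. `join_then_split` is a stand-in candidate; any function that takes an iterable of text chunks and yields lines can be passed in its place.

```python
import time
from typing import Callable, Iterable, Iterator, List, Tuple


def make_chunks(n_lines: int, line_length: int, chunk_size: int) -> List[str]:
    """Build synthetic text and cut it into fixed-size chunks."""
    text = ("x" * line_length + "\n") * n_lines
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def join_then_split(chunks: Iterable[str]) -> Iterator[str]:
    """Stand-in candidate: buffer everything, then split once."""
    yield from "".join(chunks).splitlines()


def time_line_iterator(
    fn: Callable[[Iterable[str]], Iterator[str]], chunks: List[str]
) -> Tuple[int, float]:
    """Return (number of lines produced, elapsed seconds) for one run."""
    start = time.perf_counter()
    count = sum(1 for _ in fn(chunks))
    return count, time.perf_counter() - start


chunks = make_chunks(n_lines=1000, line_length=100, chunk_size=4096)
count, seconds = time_line_iterator(join_then_split, chunks)
print(f"{count} lines in {seconds:.4f}s")
```

For a fair comparison, run each candidate several times on the same chunk list and compare the best timings rather than a single run.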