-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-128271: fix incorrect handling of negative read sizes in HTTPResponse.read()
#128270
base: main
Are you sure you want to change the base?
Conversation
Param value `amt` can be negative, e.x. -1. In that case we read all data, with chunk separators, instead of correct reading. P.S. example of using -1 here: https://github.com/Textualize/rich/blob/master/rich/progress.py#L247
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
Should I create an issue/bug before that PR? |
Yes. Now, the prototype of while (chunk_left := self._get_chunk_left()) is not None:
if amt is not None and amt <= chunk_left:
value.append(self._safe_read(amt))
self.chunk_left = chunk_left - amt
break We have an issue with
Btw, negative reading size means "read everything you can" (see: https://docs.python.org/3/library/io.html#io.BufferedIOBase.read). |
I created issue: #128271 |
HTTPResponse.read()
Considering the tests are failing and that the implementation should handle negative reading sizes as being equivalent to |
By the way, we need some tests for that. But focus on fixing the implementation first and I can help you for the tests later. |
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
|
Misc/NEWS.d/next/Library/2024-12-26-11-00-03.gh-issue-128271.mCcw3B.rst
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this is something I haven't seen in the test_chunked()
, but shouldn't we set {'Transfer-Encoding': 'chunked'}
as well? Just to be sure, can you locally check that resp.chunked
is true? (to check that we are indeed using the method you fixed)
FYI, we've had a similar change done in urllib3 urllib3/urllib3#3356 |
Thanks for the information. We could also change our own |
|
Oh yes, the docs don't mention that None is accepted. It's fine now (though I'm wondering whether we should accept it or not. Since no one complained about it (or maybe there is an opened issue for that), let's not change it for now (neither here, nor in an another PR)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could refactor one day the tests since we always do the same thing but let's leave it for another (possible) PR. Otherwise, address the last comment on the NEWS entry and I think we're good to go (although a core dev (which I'm not) would need to merge your PR).
Misc/NEWS.d/next/Library/2024-12-26-11-00-03.gh-issue-128271.mCcw3B.rst
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Implementation-wise, I don't know whether it's better to handle a negative size in read()
or in _read_chunked()
but considering we're handling negative sizes in _read1_chunked()
, it makes sense to do the same in _read_chunked()
.
Maybe some comments from @SethMichaelLarson or @illia-v (who were involved in the urllib3 PR)? |
It looks like handling this in Lines 472 to 501 in 64173cd
|
Note: I didn't quite understand whether you confirmed or not my suggestion of keeping it I think it's probably better to keep it in while (chunk_left := self._get_chunk_left()) is not None:
value.append(self._safe_read(chunk_left)) If the HTTP response is meant to be read by chunks, we should do it so since "read as many bytes you can in chunks" and "read as many bytes you can in one go" may not be equivalent (depending on how the server sends the data, right?) |
Let me clarify my point 🙂 This PR makes But if the response is not chunked, there are still different code paths for negative ints and diff --git a/Lib/http/client.py b/Lib/http/client.py
index 1c0332d82bd..33a858d34ae 100644
--- a/Lib/http/client.py
+++ b/Lib/http/client.py
@@ -472,7 +472,7 @@ def read(self, amt=None):
if self.chunked:
return self._read_chunked(amt)
- if amt is not None:
+ if amt is not None and amt >= 0:
if self.length is not None and amt > self.length:
# clip the read to the "end of response"
amt = self.length |
This is one is not really needed because we would bypass |
In most cases it's not needed indeed, but there is some logic related to raising |
Param value
amt
can be negative, e.x. -1.In that case we read all data, with chunk separators, instead of correct reading.
P.S. example of using -1 here:
https://github.com/Textualize/rich/blob/master/rich/progress.py#L247
HTTPResponse.read()
#128271