The Wayback Machine - https://web.archive.org/web/20200919225313/https://github.com/whatwg/fetch/issues/67

Response content-length header almost always wrong #67

Closed
wanderview opened this issue Jun 24, 2015 · 17 comments

Comments

@wanderview wanderview commented Jun 24, 2015

While debugging various issues in our SW implementation I've noticed that we get a lot of warnings about bad content-length headers. It seems this might be a consequence of the current spec:

  1. The network response contains a gzip'd body with a content-length matching the compressed length.
  2. Response.text() and friends return the decoded data. So gzip is stripped.
  3. Therefore respondWith() will always see an uncompressed data stream, while the content-length header still describes the compressed length.

This problem is propagated through things like the Cache API which preserve the headers.
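The mismatch in the steps above can be sketched with a small check. This is a minimal illustration, not code from any browser: `checkDeclaredLength` and `LengthCheck` are hypothetical names, headers are modelled as a plain lower-cased record rather than a real `Headers` object, and the byte counts are made-up values.

```typescript
// Hypothetical helper: compare the Content-Length a response declares
// with the number of bytes actually seen after content decoding.
interface LengthCheck {
  declared: number | null; // parsed header value, or null if absent/invalid
  actual: number;          // byte length of the decoded body
  matches: boolean;
}

function checkDeclaredLength(
  headers: Record<string, string>,
  decodedBody: Uint8Array
): LengthCheck {
  const raw = headers["content-length"];
  const declared =
    raw !== undefined && /^\d+$/.test(raw) ? Number(raw) : null;
  const actual = decodedBody.byteLength;
  return { declared, actual, matches: declared === actual };
}

// Illustrative numbers only: 120 bytes on the wire (gzip), 512 bytes decoded.
const gzipCase = checkDeclaredLength(
  { "content-encoding": "gzip", "content-length": "120" },
  new Uint8Array(512)
);
console.log(gzipCase.matches); // false: the header describes the compressed length
```

Under these assumptions, any consumer that validates the header against the bytes it receives would warn on most compressed responses.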

Should we consider fixing up the content-length header once we know the full length of the stream? Or force Response to always use chunked encoding?
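A rough sketch of the first option raised here, fixing up the header once the full decoded body is known. It only illustrates the idea; `fixUpHeaders` is a hypothetical helper, headers are modelled as a plain record, and dropping `content-encoding` alongside the stale length is an assumption of this sketch.

```typescript
// Hypothetical helper: rebuild the header map so that it describes the
// decoded body rather than the bytes that were on the wire.
function fixUpHeaders(
  headers: Record<string, string>,
  decodedBody: Uint8Array
): Record<string, string> {
  const fixed: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    // Assumption: both headers describe the wire format, so neither
    // applies once the body has been decoded.
    if (lower === "content-length" || lower === "content-encoding") continue;
    fixed[lower] = value;
  }
  fixed["content-length"] = String(decodedBody.byteLength);
  return fixed;
}

// Illustrative values only.
const fixedHeaders = fixUpHeaders(
  { "Content-Type": "text/html", "Content-Encoding": "gzip", "Content-Length": "120" },
  new Uint8Array(512)
);
console.log(fixedHeaders["content-length"]); // "512"
```

The trade-off is that the original on-the-wire value is lost, which matters if it is wanted for debugging.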

Right now I think this is just a nuisance, but it seems it could cause problems in the future.

@annevk annevk commented Jul 5, 2015

So Content-Length for responses is wrong due to https://fetch.spec.whatwg.org/#concept-http-network-fetch "handling content codings".

Currently Content-Length is not a forbidden header for responses, but it is for requests. That seems broken?

Fixing it for the Cache API but not for network responses seems wrong. I think fundamentally when you have an object the header should not be relevant anymore, but the original value might still be interesting for debugging purposes.

I wonder if @sleevi has any ideas.

@sleevi sleevi commented Jul 6, 2015

@annevk Not sure what you meant by

"I think fundamentally when you have an object the header should not be relevant anymore, but the original value might still be interesting for debugging purposes."

You're just talking about Content-Length, right? Not headers in general?

I suspect there may be all sorts of weirdness. For example, if a Network request was chunked in the Response, should that be preserved in the Response object if it was loaded from the Cache? Presumably you shouldn't have to preserve the same boundaries - they could just be handled transparently?

@annevk annevk commented Jul 6, 2015

Yes, just Content-Length. I don't think the Cache API is defined to that level of detail. It's unclear for HTTP range too. @jakearchibald?

@sleevi sleevi commented Jul 6, 2015

Yeah, HTTP range requests were the other thing I was thinking about, particularly as they relate to the use of MPEG DASH (which, depending on the serving infrastructure, may either use multiple separate URLs for chunks or use Range requests with a single streaming URL). If you wanted to, say, allow offline streaming (ignoring the ever-looming DRM question for 'meaningful' streaming), how would developers want to interact with the Cache, and what are the security implications of allowing SW control over Range requests? There have been interesting security interactions with Range in the past.

@jakearchibald jakearchibald commented Jul 17, 2015

@wanderview what does the HTTP cache do here? Does it retain the original content-length header (or its absence) or create its own? Is compression altered? What about range requests, can they be produced from the HTTP cache?

The cache API should be able to produce partial responses for range requests. This would work very well with media elements. I'm less sure about changing the transfer encoding & content-length.
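As a rough illustration of producing a partial response from a fully cached body, the sketch below handles a single byte range. It is not how any Cache API implementation works: `parseSingleRange` and `partialFromCached` are hypothetical helpers, the result is a plain object rather than a `Response`, and multi-range requests are out of scope.

```typescript
interface ByteRange { start: number; end: number } // both inclusive

interface PartialResult {
  status: number;
  headers: Record<string, string>;
  body: Uint8Array;
}

// Parse "bytes=a-b", "bytes=a-" or "bytes=-n" against a body of `size` bytes.
// Returns null when the header is unsupported or cannot be satisfied.
function parseSingleRange(rangeHeader: string, size: number): ByteRange | null {
  const m = /^bytes=(\d*)-(\d*)$/.exec(rangeHeader.trim());
  if (m === null) return null;
  const first = m[1] ?? "";
  const last = m[2] ?? "";
  if (first === "" && last === "") return null;
  let start: number;
  let end: number;
  if (first === "") {
    // Suffix form: the final n bytes of the body.
    const n = Number(last);
    if (n === 0) return null;
    start = Math.max(size - n, 0);
    end = size - 1;
  } else {
    start = Number(first);
    end = last === "" ? size - 1 : Math.min(Number(last), size - 1);
  }
  if (start > end || start >= size) return null;
  return { start, end };
}

// Build a 206-style result from the cached bytes, or null so the caller
// can decide how to respond (for example with the full body).
function partialFromCached(cached: Uint8Array, rangeHeader: string): PartialResult | null {
  const range = parseSingleRange(rangeHeader, cached.byteLength);
  if (range === null) return null;
  const slice = cached.subarray(range.start, range.end + 1);
  return {
    status: 206,
    headers: {
      "content-range": `bytes ${range.start}-${range.end}/${cached.byteLength}`,
      "content-length": String(slice.byteLength),
    },
    body: slice,
  };
}
```

Note that the offsets here index into the decoded body, which ties back to the question of which representation the stored lengths refer to.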

@wanderview wanderview commented Jul 17, 2015

@jakearchibald the http cache generally stores the on-the-wire data for the response. It does not strip gzip, etc.

This is much harder for Cache API to achieve since we get a Response where Response.body has gzip/etc decoded already. Also, the Response could be synthetic, etc.

So the http cache case has the correct content length because it's using the on-the-wire format.

@annevk annevk commented Jul 23, 2015

The tentative resolution here is that browser implementations should only use Content-Length in the network layer. After that it is no longer relevant, but we'll keep it around in the Response object. It's just not necessarily accurate anymore. This is more or less how the specifications will end up being layered anyway, but I might add a note in Fetch.

Any objections?

@wanderview wanderview commented Jul 23, 2015

Sounds good.

I think we at least need the note to signal intent here. Otherwise it could get confusing for people trying to lock down content-length in the network stack down the line.

@annevk annevk commented Aug 12, 2015

Shall I put the note as part of https://fetch.spec.whatwg.org/#concept-http-network-fetch or would you like it elsewhere?

@wanderview wanderview commented Aug 12, 2015

HTTP Network Fetch section sounds reasonable. Thanks.

@annevk annevk closed this in 49d1f1c Aug 12, 2015

@annevk annevk commented Aug 12, 2015

Thank you!

@Mouvedia Mouvedia commented Sep 6, 2018

@annevk I don't understand the note.

  1. Why did it become unreliable?
  2. What does it return? What did it return?
  3. What's the reasoning behind the change?

@annevk annevk commented Sep 6, 2018

@Mouvedia this issue should answer those questions, but the context of the note should as well. If you potentially transform the response before returning it (as implementations do), but keep headers as-is, Content-Length might not be accurate unless the transform didn't affect the size.

@Mouvedia Mouvedia commented Sep 6, 2018

So the definition of what the content-length represents changes after the completion?
It's only kept as an artifact, correct?

@annevk I am interested in its value during streaming; if it is exposed, sent, and accurate, will it still be equivalent to XHR's ProgressEvent#total?

@annevk annevk commented Sep 7, 2018

What Content-Length represents doesn't change; the contents of the response change, thereby indeed making it an artifact.

@Mouvedia it should never be different from ProgressEvent's total for H/1 connections at least. I'm not sure how H/2 deals with incorrect Content-Length values.
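For the streaming case, a progress calculation that uses the header as its total might look like the sketch below. It is an illustration only: `progressSteps` is a hypothetical helper, the chunk array stands in for what a reader obtained from `response.body.getReader()` would deliver, and treating the total as unknown once more bytes arrive than were declared is a design choice of this sketch, not behavior taken from any specification.

```typescript
interface Progress {
  loaded: number;       // decoded bytes seen so far
  total: number | null; // declared length, or null when unknown or untrustworthy
}

// Hypothetical helper: derive progress snapshots from body chunks and the
// Content-Length header value (null when the header is absent).
function progressSteps(chunks: Uint8Array[], contentLength: string | null): Progress[] {
  const declared =
    contentLength !== null && /^\d+$/.test(contentLength)
      ? Number(contentLength)
      : null;
  let loaded = 0;
  let total = declared;
  const steps: Progress[] = [];
  for (const chunk of chunks) {
    loaded += chunk.byteLength;
    // If decoding grew the body past the declared length, the header can
    // no longer serve as a total, so stop reporting one.
    if (total !== null && loaded > total) total = null;
    steps.push({ loaded, total });
  }
  return steps;
}

// Illustrative values: three 4-byte chunks against a declared length of 10.
const steps = progressSteps(
  [new Uint8Array(4), new Uint8Array(4), new Uint8Array(4)],
  "10"
);
console.log(steps[2]); // { loaded: 12, total: null }
```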

@Mouvedia Mouvedia commented Sep 7, 2018

Interesting, I didn't know that the Content-Length value may be overwritten by the browser: in my experience an inaccurate value set on the server was passed unchanged to the client.

@annevk annevk commented Sep 7, 2018

It doesn't get overwritten?
