Defect #58

Compressed fsockopen data stream not decoding

Added by Ryan Parman 851 days ago. Updated 586 days ago.

Status: Fixed
Start: 2008-03-30
Priority: Medium
Due date:
Assigned to: Geoffrey Sneddon
% Done: 0%
Category: HTTP
Target version: 1.1.2
Affected Version: 1.1.3
PHP Version: 5.2.4
mbstring enabled: Yes
iconv enabled: Yes
cURL enabled: No
zlib enabled: Yes


Description

It appears that the gzipped data stream returned from the fsockopen request isn't being decoded properly. Normally, we need to strip off the first 10 bytes of the data stream (the gzip header) before passing it to gzinflate(). Some servers (I'm looking at you, Microsoft IIS 6.0) don't send back a gzip header at all, so we need to do some additional sniffing in this case.
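As a rough illustration of the sniffing described above (this is not the code from 968.patch), the sketch below looks for the gzip magic bytes before deciding what to strip. The function name sniff_and_inflate() is hypothetical, and the sketch assumes a gzip header with no optional fields (FLG byte of 0), in which case the header is exactly 10 bytes and the trailer is 8 bytes:

```php
<?php
// Hypothetical helper: inflate a body that may or may not carry a gzip header.
function sniff_and_inflate($body)
{
    // A gzip member starts with 0x1f 0x8b, followed by the method byte 0x08 (DEFLATE).
    if (strlen($body) >= 18 && substr($body, 0, 3) === "\x1f\x8b\x08") {
        $flags = ord($body[3]);
        if ($flags !== 0) {
            // Optional header fields (FEXTRA, FNAME, ...) are not handled in this sketch.
            return false;
        }
        // Drop the 10-byte header and the 8-byte CRC32/ISIZE trailer.
        return @gzinflate(substr($body, 10, -8));
    }

    // No gzip header found: assume the server sent a raw DEFLATE stream.
    return @gzinflate($body);
}
```

A header that sets any of the optional-field flags is longer than 10 bytes, which is one reason a fixed-offset strip is fragile.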

The feed in question is http://www.xkcd.com/rss.xml

968.patch - Relevant changes in 968 (1.6 KB) Ryan Parman, 2008-03-30 07:23


Related issues

duplicated by Defect #106 Limitation on GZIP Header detection Duplicate 2008-11-12

Associated revisions

Revision 1011
Added by Geoffrey Sneddon 631 days ago

Commit new code for HTTP/fsocket content codings. Hopefully fixes #58. Test plz.

Revision 1017
Added by Geoffrey Sneddon 627 days ago

Merge r1011–r1014 and r1016 into 1.1. Fixes #58.

History

Updated by Ryan Parman 851 days ago

Patched on the trunk @ r968, although I'd like it to get more testing before pushing it out to the branch.

Updated by Ryan Parman 851 days ago

  • Status changed from New to Fixed

Updated by Ryan Parman 851 days ago

Updated by Geoffrey Sneddon 851 days ago

We can't use stuff from the PHP manual; the license is incompatible. I was working on code for this before, anyway. And blaming IIS/6.0 is wrong. We're the ones who don't comply with the spec.

Updated by Geoffrey Sneddon 851 days ago

To quote Wikipedia's gzip page:

The “Content-Encoding” header in HTTP/1.1 allows clients to optionally receive compressed HTTP responses and (less commonly) to send compressed requests. The standard itself specifies two compression methods: “gzip” (RFC 1952; the content wrapped in a gzip stream) and “deflate” (RFC 1950; the content wrapped in a zlib-formatted stream). Compressed responses are supported by many HTTP client libraries, almost all modern browsers and both of the major HTTP server platforms, Apache and Microsoft IIS. Many server implementations, however, incorrectly implement the protocol by using the raw DEFLATE stream format (RFC 1951) instead. The bug is sufficiently pervasive that most modern browsers will accept both RFC 1951 and RFC 1950-formatted data for the “deflate” compressed method.

PHP < 6 has no built-in function to read gzip streams (in PHP 6 it is gzdecode(), FWIW, but it gives no access to any of the metadata). My class does that on PHP 5, and should be easily portable to PHP 4. gzuncompress() deals with zlib streams (the "deflate" type in HTTP/1.1), and gzinflate() deals with raw DEFLATE streams.
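To make the three formats concrete, here is a small sketch using only the zlib extension's own encode/decode pairs. The variable names are illustrative; the byte offsets assume a zlib stream without a preset dictionary (2-byte header, 4-byte Adler-32 checksum):

```php
<?php
$text = str_repeat('Hello, feed! ', 20);

// RFC 1952 (gzip): starts with the magic bytes 0x1f 0x8b.
$gzip = gzencode($text);
// RFC 1950 (zlib): 2-byte header, DEFLATE data, 4-byte Adler-32 checksum.
$zlib = gzcompress($text);
// RFC 1951 (raw DEFLATE): no header and no checksum.
$raw = gzdeflate($text);

$gzip_magic_ok  = (substr($gzip, 0, 2) === "\x1f\x8b");
$zlib_roundtrip = gzuncompress($zlib);
$raw_roundtrip  = gzinflate($raw);

// Stripping the zlib wrapper leaves a raw DEFLATE stream that gzinflate() accepts.
$unwrapped = gzinflate(substr($zlib, 2, -4));
```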

For Content-Encoding: gzip, we should use gzdecode() or my class; for Content-Encoding: deflate, we should try gzuncompress() first, but fall back to gzinflate().
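The strategy above might look something like the following. This is only a sketch, not the code committed for this ticket: decode_content() is a hypothetical name, and it assumes a PHP build where gzdecode() is available (otherwise a userland gzip reader would take its place):

```php
<?php
// Hypothetical helper: decode an HTTP body according to its Content-Encoding value.
function decode_content($body, $encoding)
{
    switch (strtolower(trim($encoding))) {
        case 'gzip':
        case 'x-gzip':
            // RFC 1952 gzip stream.
            return @gzdecode($body);

        case 'deflate':
            // Per the spec this is a zlib stream (RFC 1950)...
            $decoded = @gzuncompress($body);
            if ($decoded === false) {
                // ...but many servers send raw DEFLATE (RFC 1951) instead.
                $decoded = @gzinflate($body);
            }
            return $decoded;

        default:
            // Unknown or identity coding: return the body untouched.
            return $body;
    }
}
```

Trying gzuncompress() first keeps spec-compliant servers on the strict path; the gzinflate() fallback only runs when the zlib header check fails.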

Updated by Geoffrey Sneddon 851 days ago

  • Status changed from Fixed to New

Oh, and reopening it.

Updated by Geoffrey Sneddon 792 days ago

  • Assigned to changed from Ryan Parman to Geoffrey Sneddon

Updated by Geoffrey Sneddon 631 days ago

  • Status changed from New to Fixed

Applied in changeset r1011.

Updated by Geoffrey Sneddon 631 days ago

  • Status changed from Fixed to New

Re-opening for 1.1.2.

Updated by Geoffrey Sneddon 627 days ago

  • Status changed from New to Fixed

Applied in changeset r1017.

Updated by Geoffrey Sneddon 620 days ago

  • Subject changed from gzipped fsockopen data stream not decoding correctly. to Compressed fsockopen data stream not decoding

Updated by Adrian Lang 604 days ago

Still broken; unpack doesn't return unsigned ints (see note at http://de3.php.net/unpack), so you need to do the following:

$crc = current(unpack('N', substr($this->compressed_data, $this->position, 4)));
if ($crc < 0) {
    $crc += 4294967296;
}

Yet ISIZE is not decoded correctly, don't know what's wrong there.

Updated by Geoffrey Sneddon 604 days ago

  • Affected Version changed from 1.1.1 to 1.1.2

Adrian Lang wrote:

Still broken; unpack doesn't return unsigned ints (see note at http://de3.php.net/unpack), so you need to do the following:

$crc = current(unpack('N', substr($this->compressed_data, $this->position, 4)));
if ($crc < 0) {
    $crc += 4294967296;
}

Yet ISIZE is not decoded correctly, don't know what's wrong there.

CRC shouldn't be a problem: both unpack and hexdec will return signed integers and not unsigned, but comparing these still works as they are still bit-for-bit equivalent (except, maybe, on 64-bit systems?). That workaround doesn't work on 32-bit systems. Regardless, this should be a separate ticket.
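For whoever picks this up in a separate ticket, one way to sidestep the signed/unsigned question is to compare the values as unsigned decimal strings. A minimal sketch, assuming a complete gzip member held in memory and data under 4 GiB; gzip_trailer_matches() is a hypothetical name, and the 'V' format is used because RFC 1952 stores the CRC32 and ISIZE trailer fields least-significant byte first:

```php
<?php
// Hypothetical helper: check the 8-byte gzip trailer against the decoded data.
function gzip_trailer_matches($gzip, $data)
{
    // The trailer is CRC32 followed by ISIZE, each a 32-bit little-endian integer.
    $trailer = unpack('Vcrc/Visize', substr($gzip, -8));

    // sprintf('%u') renders the integer as unsigned, so the comparison does not
    // depend on whether this build reports large values as negative numbers.
    $crc_ok  = sprintf('%u', $trailer['crc']) === sprintf('%u', crc32($data));
    $size_ok = sprintf('%u', $trailer['isize']) === sprintf('%u', strlen($data));

    return $crc_ok && $size_ok;
}
```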

Updated by Geoffrey Sneddon 586 days ago

  • Affected Version changed from 1.1.2 to 1.1.3
