Fixing errors when decompressing a gzip data stream with Python's zlib

Decompressing gzip data with zlib often fails with the following error:

Error -3 while decompressing data: incorrect header check

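The message is clear: the header check failed during decompression.

By default, zlib.decompress expects data with a zlib header, but your data may be in any of the following formats: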

deflate

zlib

gzip
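The three formats can usually be told apart by their first two bytes. The sketch below is a heuristic only, and the function name guess_format is hypothetical: gzip data starts with the magic bytes 1f 8b (RFC 1952), a zlib header passes a checksum over its first two bytes (RFC 1950), and a raw deflate stream has no header at all, so it is assumed whenever neither check matches.

```python
import zlib


def guess_format(data: bytes) -> str:
    """Guess the container format from the first two bytes (heuristic)."""
    if len(data) >= 2 and data[0] == 0x1F and data[1] == 0x8B:
        return "gzip"  # RFC 1952 magic number
    if (
        len(data) >= 2
        and data[0] & 0x0F == 8                    # compression method 8 = deflate
        and (data[0] >> 4) <= 7                    # window size field is at most 7
        and ((data[0] << 8) | data[1]) % 31 == 0   # RFC 1950 header checksum
    ):
        return "zlib"
    # No recognizable header: assume a raw deflate stream (RFC 1951).
    return "deflate"
```

A raw deflate stream can, in rare cases, start with bytes that look like a zlib header, so the result is a hint rather than a guarantee.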

Each format needs its own wbits value when decompressing:

  • deflate: wbits = -zlib.MAX_WBITS
  • zlib: wbits = zlib.MAX_WBITS
  • gzip: wbits = zlib.MAX_WBITS | 16
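The mapping above can be captured in a small lookup table. This is a minimal sketch for Python 3; the names WBITS, compress_as and decompress_as are hypothetical helpers, not part of the zlib module.

```python
import zlib

# Hypothetical lookup table: format name -> wbits value from the list above.
WBITS = {
    "deflate": -zlib.MAX_WBITS,
    "zlib": zlib.MAX_WBITS,
    "gzip": zlib.MAX_WBITS | 16,
}


def compress_as(data: bytes, fmt: str, level: int = 9) -> bytes:
    """Compress data into the given container format."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, WBITS[fmt])
    return compressor.compress(data) + compressor.flush()


def decompress_as(data: bytes, fmt: str) -> bytes:
    """Decompress data that is known to be in the given format."""
    return zlib.decompress(data, WBITS[fmt])


# Round-trip check for all three formats.
for fmt in WBITS:
    assert decompress_as(compress_as(b"test", fmt), fmt) == b"test"
```

The same wbits value has to be used on both sides: data compressed as gzip raises the "incorrect header check" error when decompressed with the zlib or deflate setting.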

There is also a mode that detects the header automatically (zlib or gzip only).

See the following code:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import zlib

# One compressor per format; wbits selects which header and trailer are written.
deflate_compress = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
zlib_compress = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS)
gzip_compress = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS | 16)

text = b'test'  # zlib operates on bytes, not str
deflate_data = deflate_compress.compress(text) + deflate_compress.flush()
zlib_data = zlib_compress.compress(text) + zlib_compress.flush()
gzip_data = gzip_compress.compress(text) + gzip_compress.flush()

# zlib format: the default wbits works
print("1,", zlib.decompress(zlib_data))

# Raw deflate: the default wbits fails, a negative wbits works
try:
    print("21,", zlib.decompress(deflate_data))
except zlib.error as e:
    print("22,", repr(e))
print("23,", zlib.decompress(deflate_data, -zlib.MAX_WBITS))

# gzip format: the default wbits fails, MAX_WBITS | 16 works
try:
    print("31,", zlib.decompress(gzip_data))
except zlib.error as e:
    print("32,", repr(e))
print("33,", zlib.decompress(gzip_data, zlib.MAX_WBITS | 16))

# Automatic header detection (zlib or gzip)
print("41,", zlib.decompress(gzip_data, zlib.MAX_WBITS | 32))
print("42,", zlib.decompress(zlib_data, zlib.MAX_WBITS | 32))
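Automatic detection only recognizes zlib and gzip headers, so raw deflate data still fails with it. One possible way to cover all three formats, sketched here for Python 3 with a hypothetical helper named decompress_any, is to try detection first and fall back to raw deflate when the header check fails.

```python
import zlib


def decompress_any(data: bytes) -> bytes:
    """Try zlib/gzip auto-detection first, then fall back to raw deflate."""
    try:
        return zlib.decompress(data, zlib.MAX_WBITS | 32)
    except zlib.error:
        # MAX_WBITS | 32 only recognizes zlib and gzip headers;
        # a raw deflate stream has no header, so retry with negative wbits.
        return zlib.decompress(data, -zlib.MAX_WBITS)
```

If the data is corrupt rather than raw deflate, the second call raises zlib.error as well, so callers still see a failure instead of silently wrong output.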

Alternatively, the gzip module can decompress a gzip data stream:

import gzip
import io

# gzip_data is the gzip-format bytes produced in the example above
fio = io.BytesIO(gzip_data)
with gzip.GzipFile(fileobj=fio) as f:
    print("51,", f.read())
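When the compressed data arrives in pieces, for example from a socket or an HTTP response, it does not have to be collected in memory first. The following is a minimal sketch for Python 3 using zlib.decompressobj; the generator name iter_decompressed and the 16-byte chunk size are assumptions made for illustration.

```python
import zlib


def iter_decompressed(chunks, wbits=zlib.MAX_WBITS | 32):
    """Yield decompressed pieces from an iterable of compressed byte chunks."""
    decompressor = zlib.decompressobj(wbits)
    for chunk in chunks:
        piece = decompressor.decompress(chunk)
        if piece:
            yield piece
    # Emit anything still buffered once the input is exhausted.
    tail = decompressor.flush()
    if tail:
        yield tail


# Usage: split gzip data into small chunks to simulate a stream.
compressor = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS | 16)
payload = b"test" * 1000
stream = compressor.compress(payload) + compressor.flush()
chunks = [stream[i:i + 16] for i in range(0, len(stream), 16)]
assert b"".join(iter_decompressed(chunks)) == payload
```

The default wbits here enables header detection, so the same generator accepts zlib or gzip input; pass -zlib.MAX_WBITS for a raw deflate stream.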

Reference article and an excerpt from the original answer (the excerpt's examples are Python 2):

http://stackoverflow.com/questions/3122145/zlib-error-error-3-while-decompressing-incorrect-header-check

You have this error:

zlib.error: Error -3 while decompressing: incorrect header check

Which is most likely because you are trying to check headers that are not there, e.g. your data follows RFC 1951 (deflate compressed format) rather than RFC 1950 (zlib compressed format) or RFC 1952 (gzip compressed format).

choosing windowBits

But zlib can decompress all those formats:

  • to (de-)compress deflate format, use wbits = -zlib.MAX_WBITS
  • to (de-)compress zlib format, use wbits = zlib.MAX_WBITS
  • to (de-)compress gzip format, use wbits = zlib.MAX_WBITS | 16

See documentation in http://www.zlib.net/manual.html#Advanced (section inflateInit2)

examples

test data:

>>> deflate_compress = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
>>> zlib_compress = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS)
>>> gzip_compress = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS | 16)
>>> 
>>> text = '''test'''
>>> deflate_data = deflate_compress.compress(text) + deflate_compress.flush()
>>> zlib_data = zlib_compress.compress(text) + zlib_compress.flush()
>>> gzip_data = gzip_compress.compress(text) + gzip_compress.flush()
>>> 

obvious test for zlib:

>>> zlib.decompress(zlib_data)
'test'

test for deflate:

>>> zlib.decompress(deflate_data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
zlib.error: Error -3 while decompressing data: incorrect header check
>>> zlib.decompress(deflate_data, -zlib.MAX_WBITS)
'test'

test for gzip:

>>> zlib.decompress(gzip_data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
zlib.error: Error -3 while decompressing data: incorrect header check
>>> zlib.decompress(gzip_data, zlib.MAX_WBITS|16)
'test'

the data is also compatible with gzip module:

>>> import gzip
>>> import StringIO
>>> fio = StringIO.StringIO(gzip_data)
>>> f = gzip.GzipFile(fileobj=fio)
>>> f.read()
'test'
>>> f.close()

automatic header detection (zlib or gzip)

adding 32 to windowBits will trigger header detection

>>> zlib.decompress(gzip_data, zlib.MAX_WBITS|32)
'test'
>>> zlib.decompress(zlib_data, zlib.MAX_WBITS|32)
'test'

using gzip instead

or you can ignore zlib and use gzip module directly; but please remember that under the hood, gzip uses zlib.

fh = gzip.open('abc.gz', 'rb')
cdata = fh.read()
fh.close()