Friday, June 21, 2013

Using the zlib compression library in the Android NDK





 Just one day after writing the article on Crunch, the packing utility, and on how to unpack data created with it, I wondered how much overhead it would take to get into the zlib library. I was pleased to find that using it is very intuitive and that the implementation went without problems.


Android implementation


 The Android NDK already contains both the library and its header file. The header is located at:
    \android-ndk-r8b\platforms\android-14\arch-arm\usr\include\zlib.h
and the library at:
    \android-ndk-r8b\platforms\android-14\arch-arm\usr\lib\libz.so

 To make it part of your NDK project, open the Android.mk file and add the following lines to it (the first one is just a comment):
  #Zlib
  LOCAL_LDLIBS    += -lz

 Now it is enough to include the header file "zlib.h" at the beginning of any source file where you want to use it.

 In my engine I built a simple wrapper around it to simplify calling compress / uncompress. I am again using the IPacker interface introduced in the previous article. The goal is to have a unified interface for various packing / unpacking utilities and libraries. I just want to send it a ByteBuffer (our class that wraps an array and its length and adds some utilities for working with it) and get another one back with the result. As a reminder, the interface methods look like this:

 // compresses data and returns a pointer to a heap-allocated ByteBuffer holding the result
 virtual SBC::System::Collections::ByteBuffer* compress(u8* aUnpackedData, s32 aSize) = 0;
 virtual SBC::System::Collections::ByteBuffer* compress(
   SBC::System::Collections::ByteBuffer& aUnpackedData) = 0;
 // uncompresses data and returns a pointer to a heap-allocated ByteBuffer holding the result
 virtual SBC::System::Collections::ByteBuffer* uncompress(u8* aPackedData, s32 aSize) = 0;
 virtual SBC::System::Collections::ByteBuffer* uncompress(
   SBC::System::Collections::ByteBuffer& aPackedData) = 0;

 There are two things I found important about zlib:
 - it is not Zip, gzip or any other packing program (although these use the same algorithms as zlib and share its authors). Do not expect it to open zip archives you created, for example, in WinZip. Zlib has its own internal format,
 - when you compress a piece of memory, you get the result in another piece of memory without any information on how many bytes the original data had. It is up to you to handle this somehow, unless you want to allocate an extra large buffer for uncompression that every file will fit into.
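
 The first point can be checked by looking at the first few bytes of a file. The following is only a sketch: guessStreamKind and StreamKind are hypothetical names, not part of zlib, and the check is a heuristic based on the well-known signatures (zip archives start with "PK\x03\x04", gzip files with 0x1F 0x8B, and a zlib stream with a two-byte header as described in RFC 1950).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of zlib): guesses the container format
// from the first bytes of a buffer. This is a heuristic, not a validator.
enum class StreamKind { Zlib, Gzip, Zip, Unknown };

inline StreamKind guessStreamKind(const std::uint8_t* data, std::size_t size)
{
    assert(data != nullptr || size == 0);

    // Zip archives start with the local file header signature "PK\x03\x04".
    if (size >= 4 && data[0] == 0x50 && data[1] == 0x4B &&
        data[2] == 0x03 && data[3] == 0x04)
        return StreamKind::Zip;

    // gzip files start with the magic bytes 0x1F 0x8B.
    if (size >= 2 && data[0] == 0x1F && data[1] == 0x8B)
        return StreamKind::Gzip;

    // zlib streams start with the CMF and FLG bytes: the low nibble of CMF
    // is 8 (deflate), the high nibble is at most 7, and CMF * 256 + FLG
    // is a multiple of 31.
    if (size >= 2 && (data[0] & 0x0F) == 8 && (data[0] >> 4) <= 7 &&
        ((data[0] << 8) | data[1]) % 31 == 0)
        return StreamKind::Zlib;

    return StreamKind::Unknown;
}
```

 A buffer produced by zlib's compress() with default settings typically starts with 0x78 0x9C, which this check reports as StreamKind::Zlib, while a file saved by a zip program is reported as StreamKind::Zip and cannot be passed to uncompress() directly.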


Compress


 Here is the implementation of the compress routines:

//------------------------------------------------------------------------
ByteBuffer* ZlibPacker::compress(u8* aUnpackedData, s32 aSize)
{
 ByteBuffer buffer;
 buffer.construct(aUnpackedData, aSize, false);

 return compress(buffer);
}

//------------------------------------------------------------------------
ByteBuffer* ZlibPacker::compress(ByteBuffer& aUnpackedData)
{
 // maximum length needed for compression
 unsigned long destLength = compressBound(aUnpackedData.getLimit());

 The first one just wraps the raw data into a ByteBuffer and passes it on (the false says that the ByteBuffer is not the owner of the data and will not delete it on destruction). In the second one the actual work starts. First we call the compressBound() function with the length of the data we want to compress. It returns the maximum length the target buffer may need for the packed data. In the worst case, when the data does not compress at all, the final size is the size of the original data plus some zlib overhead.

 // add 4 for header (to save original unpacked data length)
 u8* out = new u8[destLength + 4];
 s32 result = ::compress(out + 4, &destLength,
   aUnpackedData.getDataPtr(), aUnpackedData.getLimit());

 if (result != Z_OK)
 {
  switch(result)
  {
  case Z_MEM_ERROR:
   LOGE("not enough memory for compression");
   break;

  case Z_BUF_ERROR:
   LOGE("not enough room in buffer to compress the data");
   break;
  }
 }

 We allocate a new buffer for the compressed data and add 4 bytes to it. These will be our "minimalistic" header, keeping information on how big the original data was.
 Notice the double colon before the compress() call. It forces the compiler to use the global zlib compress() function rather than a class method. If you omit it you will get a compiler error, because none of the compress() methods within the class takes 4 parameters.
 We pass out + 4 as the first parameter, because we want to reserve the first 4 bytes so we can write the original data size into them later.
 Also note that destLength is passed as a pointer. This is because the length of the final compressed data is handed back in it.
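
 The effect of the double colon can be reproduced in isolation. In the sketch below, pack and Packer are hypothetical stand-ins for zlib's compress() and for the ZlibPacker class; only the name lookup behaviour is the point.

```cpp
#include <cassert>

// Stand-in for a C library function in the global namespace
// (plays the role of zlib's four-parameter compress()).
inline int pack(int a, int b, int c, int d)
{
    return a + b + c + d;
}

// Stand-in for the wrapper class, which has a member with the same name.
struct Packer
{
    int pack(int value)
    {
        // Inside the class the unqualified name pack finds only the
        // member, so pack(value, 0, 0, 0) would not compile here.
        // The leading :: selects the global function instead.
        return ::pack(value, 0, 0, 0);
    }
};

// Small self-check of the sketch.
inline void checkPacker()
{
    Packer packer;
    assert(packer.pack(5) == 5);   // 5 + 0 + 0 + 0
}
```

 Removing the :: in Packer::pack makes the compiler look only at the member, which takes one parameter, and the call with four arguments is rejected.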

 // create new buffer and copy data into it
 ByteBuffer* packed = new ByteBuffer();
 packed->construct(destLength + 4, out, destLength + 4);

 // delete the old buffer - it is no longer needed
 delete [] out;

 Now the data is packed and its final packed length is known. As zlib works well, our initial maximum-size buffer is probably too big. Part of it is empty and just eats memory. So we create a new ByteBuffer with just the capacity needed for the compressed data and the 4-byte header. The ByteBuffer construct() call copies the requested amount of data into the new ByteBuffer.
 We can now free the old (probably partly empty) raw buffer.

 // set position to beginning and save unpacked size
 packed->setPosition(0);
 packed->setInt((s32) aUnpackedData.getLimit());
 packed->setPosition(0);

 return packed;
}

 At the end we write the original (unpacked) data length into the first 4 bytes and return the result.
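
 If the packed files are ever read on a different platform, it may help to fix the byte order of the 4-byte header explicitly. The sketch below is an assumption on my side, not how ByteBuffer::setInt() is known to work: writeLengthHeader and readLengthHeader are hypothetical helpers that always store the least significant byte first.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: store and load a 32-bit length as 4 bytes,
// least significant byte first, independent of the CPU's byte order.
inline void writeLengthHeader(std::uint8_t* dest, std::uint32_t length)
{
    assert(dest != nullptr);
    dest[0] = static_cast<std::uint8_t>(length & 0xFF);
    dest[1] = static_cast<std::uint8_t>((length >> 8) & 0xFF);
    dest[2] = static_cast<std::uint8_t>((length >> 16) & 0xFF);
    dest[3] = static_cast<std::uint8_t>((length >> 24) & 0xFF);
}

inline std::uint32_t readLengthHeader(const std::uint8_t* src)
{
    assert(src != nullptr);
    return static_cast<std::uint32_t>(src[0]) |
           (static_cast<std::uint32_t>(src[1]) << 8) |
           (static_cast<std::uint32_t>(src[2]) << 16) |
           (static_cast<std::uint32_t>(src[3]) << 24);
}
```

 With helpers like these the header is written into the first 4 bytes of the output buffer, and the zlib data starts at offset 4, exactly as in the routine above.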


Uncompress


 Uncompression looks similar:

//------------------------------------------------------------------------
ByteBuffer* ZlibPacker::uncompress(u8* aPackedData, s32 aSize)
{
 ByteBuffer buffer;
 buffer.construct(aPackedData, aSize, false);

 return uncompress(buffer);
}

//------------------------------------------------------------------------
ByteBuffer* ZlibPacker::uncompress(ByteBuffer& aPackedData)
{
 // read size of unpacked data
 unsigned long destLength = (u32) aPackedData.getInt();

 // create bytebuffer with sufficient capacity to hold unpacked data
 ByteBuffer* unpacked = new ByteBuffer();
 unpacked->construct(destLength);

 First we read the length of the uncompressed data. This is not part of zlib, and it is up to you to keep it somewhere (or to allocate a buffer big enough to hold the biggest file you expect to process).

 s32 result = ::uncompress(unpacked->getDataPtr(), &destLength,
   aPackedData.getPositionPtr(), aPackedData.getLimit() - 4);

 if (result != Z_OK)
 {
  switch(result)
  {
  case Z_MEM_ERROR:
   LOGE("not enough memory for uncompression");
   break;

  case Z_BUF_ERROR:
   LOGE("not enough room in buffer to uncompress the data");
   break;

  case Z_DATA_ERROR:
   LOGE("compressed data corrupted or incomplete");
   break;
  }
 }

 Here we call the zlib uncompress() function. The third parameter, "aPackedData.getPositionPtr()", returns an unsigned char* pointing just after the 4-byte header! Do not forget to skip any headers you added, so that the pointer points to the raw zlib data. The position within the ByteBuffer moved automatically when we read the size at the very beginning of the method.

 // destLength now holds the actual number of uncompressed bytes
 unpacked->setLimit((s32) destLength);

 return unpacked;
}

 Finally we set the limit of the new ByteBuffer and return the result.


Summary


 It is easy to make zlib work in your project. Remember that it is not an UnZip program, so just putting some zips into your assets will not work. The zlib library also supports compressing and uncompressing in the gzip (.gz) format and adds gzip file access functions to make working with it easy. It is still too fresh for me, so I will have to examine it in the future.