data compression link collection

LZ77/LZSS and derivatives

This topic covers the LZ77 algorithm and its descendant, LZSS. LZ77 is one of two seminal LZ compression algorithms developed in the late 1970s. LZSS forms the core of the popular deflate algorithm when combined with a Huffman encoder on the back end.
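For a quick feel of what deflate does in practice, here is a minimal sketch using Python's standard zlib module, which implements deflate. The sample input is made up for illustration.

```python
import zlib

# Highly repetitive input, so the LZ-style matching stage has plenty to find.
data = b"abracadabra " * 200

packed = zlib.compress(data, 9)     # deflate at the highest compression level
restored = zlib.decompress(packed)  # round trip back to the original bytes

print(len(data), "->", len(packed), "bytes")
```

The repeated phrase collapses into back-references, and the Huffman stage then shortens the codes for whatever is left.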

PyLZMA homepage

PyLZMA provides Python bindings for the LZMA SDK, using Igor Pavlov's LZMA compression.
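As a rough idea of what LZMA compression from Python looks like, the sketch below uses the standard-library lzma module rather than PyLZMA itself. That module wraps liblzma, not the LZMA SDK, so treat it as a stand-in; the sample data is an assumption.

```python
import lzma

# Made-up repetitive payload standing in for real data.
data = b"embedded firmware image " * 500

packed = lzma.compress(data, preset=6)  # preset 6 is the library default
restored = lzma.decompress(packed)

print(len(data), "->", len(packed), "bytes")
```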


Posted in July 2nd, 2007

A Universal Algorithm for Sequential Data Compression

The 1977 paper describing an algorithm for compression using pointers to previously seen text. This algorithm, later known as LZ77, is still one of the most widely used techniques for lossless data compression.
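The idea of pointing back at previously seen text can be sketched in a few lines of Python. This is an LZSS-style illustration (each token is either a literal byte or a distance/length pointer), not the exact triple format of the 1977 paper, and the function names, window size, and length limits are all assumptions chosen for readability.

```python
def lz_compress(data: bytes, window: int = 4096, min_len: int = 3, max_len: int = 18):
    """Greedy encoder: emit literal byte values or (distance, length) pointers."""
    tokens, i = [], 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Brute-force search of the sliding window for the longest match.
        for j in range(max(0, i - window), i):
            k = 0
            while k < max_len and i + k < len(data) and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_dist = k, i - j
        if best_len >= min_len:
            tokens.append((best_dist, best_len))  # pointer into earlier text
            i += best_len
        else:
            tokens.append(data[i])                # literal byte
            i += 1
    return tokens


def lz_decompress(tokens) -> bytes:
    out = bytearray()
    for t in tokens:
        if isinstance(t, tuple):
            dist, length = t
            for _ in range(length):   # byte by byte, so overlapping copies work
                out.append(out[-dist])
        else:
            out.append(t)
    return bytes(out)


sample = b"abcabcabcabcxyz"
tokens = lz_compress(sample)
print(tokens)
```

Note that a pointer may overlap the position being written, which is how a short repeating pattern gets encoded as one long match.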

Update 2004: Document is now packed in RAR format.


Posted in May 16th, 2004

LZMA SDK From 7-Zip

Igor Pavlov has released his LZMA code in a separate SDK, and is claiming excellent performance characteristics that make this a potential hit in the embedded world.

* * * * *

Posted in February 22nd, 2004


rzip is a compression program, similar in functionality to gzip or bzip2, but able to take advantage of long-distance redundancies in files, which can sometimes allow rzip to produce much better compression ratios than other programs. The original idea behind rzip is described in my PhD thesis, but the implementation in this version is considerably improved from the original implementation. The new version is much faster and also produces a better compression ratio.
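To see why long-distance redundancy matters, the sketch below repeats a block of random bytes and compresses it twice: once with zlib, whose deflate window is 32 KiB, and once with the standard-library lzma module, whose default dictionary is far larger. Neither is rzip; they are stand-ins to show the effect of match distance, and the block size is an arbitrary assumption.

```python
import lzma
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
block = bytes(rng.getrandbits(8) for _ in range(200_000))  # incompressible on its own
data = block + block    # the second copy starts 200,000 bytes after the first

deflate_size = len(zlib.compress(data, 9))  # 32 KiB window cannot reach the first copy
lzma_size = len(lzma.compress(data))        # large dictionary can reach it

print("input:  ", len(data))
print("deflate:", deflate_size)
print("lzma:   ", lzma_size)
```

The deflate output stays close to the full input size, while the larger dictionary roughly halves it.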


Posted in February 14th, 2004


JCALG1 is a small, open-source, LZSS-derived compression library.

Features:

  • Coded in 100% 32-bit x86 assembly language for maximum performance and minimum size.
  • Good compression ratio, typically much better than ZIP’s deflate.
  • Extremely small and fast decompressor.
  • Adjustable window size to allow for faster compression at the cost of compression ratio.
  • Decompression requires no memory, other than the destination buffer.
  • Easy integration with any application.
  • Free!


Posted in January 11th, 2004

Michael Dipperstein’s LZSS Code Page

Michael Dipperstein describes his personal quest for understanding and implementation of LZSS coding. Full source included.

* * * * *

Posted in December 3rd, 2003

Redundanz - Lempel-Ziv-Kodierung

A lecture in German on LZ coding.

* * *    

Posted in September 28th, 2003


An Open Source library that implements an LZSS algorithm, designed for speed. ANSI C, with 16- and 32-bit x86 assembler versions available as well.


Posted in September 13th, 2003


Compreso implements compressed sockets using an LZH algorithm, as implemented by Rolando Herrero. It does this cooperatively with the Win32 socket library, so you can only run this code under Windows. Freeware.


Posted in June 9th, 2003

The Standard Function Library: Compression Functions

The guys at iMatix had the idea that they could write a super-library of C functions that would be so useful it would rule the world. As far as I can tell, it didn’t catch on. However, there are a few compression functions here that some folks might find interesting.


Posted in May 4th, 2003

Free DCL Decompressor

Mark Adler built a decompressor that is able to read streams built with PKWare’s Data Compression Library. Since PKWare hasn’t released source for DCL, this is a very good thing, and free to boot.


Posted in February 27th, 2003

McKee’s Directed Acyclic Graph Compression

Will McKee has released this as freeware, including complete source to a string substitution compressor. From the description it sounds as though it’s a variant of LZSS, but I’ll defer to anyone willing to do a real analysis.


Posted in December 9th, 2002

ITU Recommendation V.44

This is the data compression standard that implements the LZJH algorithm, and is used in V.90 and V.92 modems. The ITU wants to charge you a few bucks for this standard, but if you believe the post from Pete Fraser (listed elsewhere), you can get three free standards per year. Maybe this ought to be one of them.


Posted in November 9th, 2002

Jonathan Bennet’s C++ implementation of LZSS

A C++ implementation of the LZSS / LZ77 algorithm. Also contains a description of the LZSS algorithm and my implementations of it as I learned more about it (hashing, lazy evaluation, etc.). All the code from my first attempt to the current version is included.

An anonymous visitor to Jonathan’s page said it was “Pertinent, very useful, relevant, just what I needed.”

* * * * *

Posted in October 30th, 2002

High School Kids Win Prizes for Compression Algorithm

A couple of high school kids from Saratoga, CA, were regional winners in the Siemens Westinghouse Science and Technology competition.


Posted in October 12th, 2002

TCompress Component Set

File and database compression components for Delphi. Compress to/from file, memory, and blob. Uses RLE, LZH, and LZW. Shareware.

Version 7.0 was released in September of 2002.

* * * *  

Posted in September 15th, 2002

UPL Compression : the complete professional toolkit

The UPL Compression Library is a high-performance professional compression library. It can compress and decompress data, buffers, strings, or single files, and features the latest innovations in data compression. The library offers eight compression algorithms: dynamic Huffman, arithmetic, BWT, PPM, and several Lempel-Ziv flavors.

User John G. had this to say: “I was looking for adding a better compression to my Visual Basic project and it worked like a charm. The compression ratio is really good, better than Zip!”

* * * *  

Posted in August 27th, 2002

Improving the Speed of LZ77 Compression by Hashing and Suffix Sorting

By Kunihiko Sadakane and Hiroshi Imai. This paper proposes two new methods of performing fast string matching in LZ77 compression. One method uses a new hashing algorithm; the other uses suffix sorting.
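For readers unfamiliar with hash-based match finding, here is a minimal sketch of the general idea: index each 3-byte prefix in a table so that a candidate match is found by lookup rather than by scanning the whole window. This is a plain textbook-style table, not the hashing algorithm the paper proposes, and the function name and 3-byte key length are assumptions.

```python
def find_matches(data: bytes, min_len: int = 3):
    """Return (position, distance, length) using the most recent earlier occurrence."""
    last_seen = {}  # 3-byte prefix -> most recent position where it started
    matches = []
    for i in range(len(data) - min_len + 1):
        key = data[i:i + min_len]
        j = last_seen.get(key)
        if j is not None:
            # The prefix matched by construction; extend the match as far as it goes.
            k = min_len
            while i + k < len(data) and data[j + k] == data[i + k]:
                k += 1
            matches.append((i, i - j, k))
        last_seen[key] = i  # remember this position for later lookups
    return matches


matches = find_matches(b"abcdeabcdx")
print(matches)
```

Keeping only the most recent position per key makes lookups fast but can miss a longer, older match; real compressors usually chain several candidates per key.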


Posted in July 16th, 2002


Ada source for compression based on the LZH package.

* * * * *

Posted in June 8th, 2002

More PKWare DCL Decompression Code

C++ code posted to comp.compression that describes extraction from PKWare’s Data Compression Library.

Update: The author posted this correction to comp.compression:
There’s a bug in the code posted 2001-10-07 19:36:38 PST. To fix:
In the

void tcDecoder::Decode(char *apBuffer, unsigned int *apSize, unsigned int anBufferSize)

function after both

if (lnIndex == mnCurrentPos) lnIndex = lnStartIndex;


if (lnIndex == mnDictionarySize) lnIndex = 0;


Posted in April 22nd, 2002

Dictionary Selection using Partial Matching

D. T. Hoang, P. M. Long and J. S. Vitter. “Dictionary Selection using Partial Matching,” Information Sciences, 119(1-2), 57-72, 1999. This paper describes an attempt to squeeze improved compression out of existing dictionary-based schemes by using multiple context-based dictionaries for encoding.


Posted in April 8th, 2002

Tlzrw1 : Delphi compression component with LZH and LZRW1/KH

The LZH and LZRW1/KH routines are from the SWAG Pascal code archive.


Posted in February 26th, 2002

TLZHCompressor a compression component for Delphi

This unit implements a component which allows the user to compress data using a combination of LZSS compression and adaptive Huffman coding (similar to that used by LHARC 1.x), or conversely to decompress data that was previously compressed by this unit.


Posted in February 26th, 2002

DCompress v1.00 library

For Delphi and other Windows compilers. A DLL of compression/decompression routines, mostly written in assembler. Fast decompression!


Posted in February 26th, 2002

LZSSLib - a Windows compression DLL for Windows programmers.

LZSSLib is a compression library (DLL) for Windows programmers. It provides compression and decompression functions for file-to-file operations. LZSSLib uses the LZSS algorithm with various modifications, each providing different enhancements. Very simple to use: LZSSPackFile('PROG.EXE', 'PROG.LZS'). Works with any language that supports DLL calling, such as Turbo Pascal, C/C++, Actor, Visual Basic, Realizer, even ObjectVision.

* * * * *

Posted in February 26th, 2002

PKWare Data Compression Library Format

In this comp.compression posting, Ben Rudiak-Gould opens up the compression format used by the PKWare Data Compression Library. This is the only place I have ever seen this information disclosed; PKWare has certainly not done so.

* * * * *

Posted in September 22nd, 2001

The LZSS Algorithm

The Data Compression Center gives an explanation of LZSS coding.
This link points to an archived site, as the original has disappeared. Links on the archived page may or may not work properly.


Posted in December 2nd, 2000

LZ77 Daten - Dekompression auf dem 68HC11 Micro Controller

Christian Scheurer wrote up his LZ77 project that was targeted to the 68HC11 processor. It’s written in German; perhaps you could use Babel Fish to translate it. Source code included.


Posted in September 24th, 2000

Collake Software - JCALG1

Home of JCALG1, an LZSS-derived lossless compression algorithm with full 32-bit x86 assembly source.

Data Compression Library user comment: I found LZSS C source and an EXE. The EXE was useful for testing. I expect to use this in an embedded app after further research.

* * *    

Posted in May 16th, 2000

An Optimizing Hybrid LZ77 RLE Data Compression Program

An Optimizing Hybrid LZ77 RLE Data Compression Program, aka Improving Compression Ratio for Low-Resource Decompression by Pasi Ojala.

Presents a new literal tagging system, a fast exhaustive string
match algorithm, an optimal parsing algorithm, and results on
the Calgary Corpus and the Canterbury Corpus.

* * * *  

Posted in December 13th, 1999