This topic covers the LZ77 algorithm and its descendant, LZSS. LZ77 is one of the two seminal LZ compression algorithms developed in the late 1970s. LZSS forms the core of the popular deflate algorithm when combined with a Huffman encoder on the back end.
PyLZMA homepage
PyLZMA lets you use the LZMA SDK from Python, with LZMA compression by Igor Pavlov.
http://www.joachim-bauch.de/projects/python/pylzma/
A Universal Algorithm for Sequential Data Compression
The 1977 paper describing an algorithm for compression using pointers to previously seen text. This algorithm, later known as LZ77, is still one of the most widely used techniques for lossless data compression today.
Update 2004: Document is now packed in RAR format.
http://www.compression.ru/download/articles/lz/ziv_lempel_1977_universal_algorithm_pdf.rar
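As a rough illustration of the "pointers to previously seen text" idea, here is a minimal Python sketch. It is not taken from the paper: the function names lz77_compress and lz77_decompress, the 4096-byte window, and the 18-byte match limit are assumptions chosen for readability, and the brute-force search is far slower than anything you would use in practice.

```python
def lz77_compress(data: bytes, window: int = 4096, max_len: int = 18):
    """Return a list of (offset, length, next_byte) triples.

    Hypothetical helper for illustration only; window and max_len
    are arbitrary assumptions, not values from the 1977 paper.
    """
    out = []
    i, n = 0, len(data)
    while i < n:
        best_len, best_off = 0, 0
        start = max(0, i - window)
        # Leave room for the literal byte that ends every triple.
        limit = min(max_len, n - i - 1)
        # Brute-force search of the window for the longest match.
        for j in range(start, i):
            length = 0
            while length < limit and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out


def lz77_decompress(triples):
    """Rebuild the original bytes from (offset, length, next_byte) triples."""
    out = bytearray()
    for off, length, nxt in triples:
        # Copy one byte at a time so overlapping matches work.
        for _ in range(length):
            out.append(out[-off])
        out.append(nxt)
    return bytes(out)
```

Calling lz77_decompress(lz77_compress(data)) should hand back the original bytes. Each triple says "go back offset bytes, copy length bytes, then append one literal byte", which is the pointer scheme in its simplest form.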
LZMA SDK From 7-Zip
Igor Pavlov has released his LZMA code in a separate SDK, and is claiming excellent performance characteristics that make this a potential hit in the embedded world.
rzip
rzip is a compression program, similar in functionality to gzip or bzip2, but able to take advantage of long-distance redundancies in files, which can sometimes allow rzip to produce much better compression ratios than other programs. The original idea behind rzip is described in my PhD thesis (see http://samba.org/~tridge/), but the implementation in this version is considerably improved from the original implementation. The new version is much faster and also produces a better compression ratio.
JCALG1
JCALG1 is a small, open-source, LZSS derived compression library.
Features:
- Coded in 100% 32-bit x86 assembly language for maximum performance and minimum size.
- Good compression ratio, typically much better than ZIP’s deflate.
- Extremely small and fast decompressor.
- Adjustable window size to allow for faster compression at the cost of compression ratio.
- Decompression requires no memory, other than the destination buffer.
- Easy integration with any application.
- Free!
http://www.collakesoftware.com/jcalg1.htm
Michael Dipperstein’s LZSS Code Page
Michael Dipperstein describes his personal quest to understand and implement LZSS coding. Full source included.
http://michael.dipperstein.com/lzss/
Redundanz - Lempel-Ziv-Kodierung (Redundancy: Lempel-Ziv Coding)
BriefLZ
An Open Source library that implements an LZSS algorithm, designed for speed. ANSI C, with 16- and 32-bit x86 assembler versions available as well.
Compreso
Compreso implements compressed sockets using an LZH algorithm, as implemented by Rolando Herrero. It does this cooperatively with the Win32 socket library, so you can only run this code under Windows. Freeware.
http://www.compreso.com.ar/compreso.htm
The Standard Function Library: Compression Functions
The guys at iMatix had the idea that they could write a super-library of C functions that would be so useful it would rule the world. As far as I can tell, it didn’t catch on. However, there are a few compression functions here that some folks might find interesting.
http://www.imatix.com/html/sfl/sfl17.htm#TOC30
Free DCL Decompressor
Mark Adler built a decompressor that is able to read streams built with PKWare’s Data Compression Library. Since PKWare hasn’t released source for DCL, this is a very good thing, and free to boot.
http://www.alumni.caltech.edu/~madler/blast10.tar.gz
McKee’s Directed Acyclic Graph Compression
Will McKee has released this as freeware, including complete source to a string substitution compressor. From the description it sounds as though it’s a variant of LZSS, but I’ll defer to anyone willing to do a real analysis.
http://www.cjkware.com/wamckee/mcdag.zip
ITU Recommendation V.44
This is the data compression standard that implements the LZJH algorithm, and is used in V.90 and V.92 modems. The ITU wants to charge you a few bucks for this standard, but if you believe the post from Pete Fraser (listed elsewhere on DataCompression.info) you can get three free standards per year. Maybe this ought to be one of them.
http://www.itu.int/rec/recommendation.asp?type=items&lang=e&parent=T-REC-V.44-200011-I
Jonathan Bennett’s C++ implementation of LZSS
A C++ implementation of the LZSS / LZ77 algorithm. Also contains a description of the LZSS algorithm and my implementations of it as I learned more about it (hashing, lazy evaluation, etc.) All the code from my first attempt to the current version is included.
An anonymous visitor to Jonathan’s page said it was “Pertinent, very useful, relevant, just what I needed.”
http://www.hiddensoft.com/cgi-bin/countdown.pl?code/LZSS.zip
High School Kids Win Prizes for Compression Algorithm
A couple of high school kids from Saratoga, CA, were regional winners in the Siemens Westinghouse Science and Technology competition.
http://www.svcn.com/archives/saratoganews/01.10.01/edu-0102.html
TCompress Component Set
File and database compression components for Delphi. Compress to/from file, memory, and blob. Uses RLE, LZH, and LZW. Shareware.
Version 7.0 was released in September of 2002.
http://www.spis.co.nz/compress.htm
UPL Compression : the complete professional toolkit
The UPL Compression Library is a high-performance professional compression library. It can compress and decompress data, buffers, strings, or single files, and features the latest innovations in data compression. The library offers eight extremely powerful compression algorithms, including dynamic Huffman, arithmetic coding, BWT, PPM, and several Lempel-Ziv flavors.
DataCompression.info user John G. had this to say: I was looking for adding a better compression to my Visual Basic project and it worked like a charm. The compression ratio is really good, better than Zip!
Improving the Speed of LZ77 Compression by Hashing and Suffix Sorting
by Kunihiko Sadakane, Hiroshi Imai. This paper proposes two new methods of performing fast string matching in LZ77 compression. One method uses a new hashing algorithm, the other uses suffix sorting.
http://citeseer.nj.nec.com/sadakane00improving.html
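To give a feel for why hashing speeds up string matching, here is a minimal Python sketch of a conventional hash-table match finder. To be clear, this is not the new hashing method or the suffix-sorting method from the paper; it is the plain "index every 3-byte prefix" approach. The names find_matches and expand, the MIN_MATCH value, and the window, max_len, and max_chain defaults are all assumptions made for illustration.

```python
from collections import defaultdict

MIN_MATCH = 3  # assumed shortest match worth encoding


def find_matches(data: bytes, window: int = 4096, max_len: int = 18,
                 max_chain: int = 32):
    """Greedy parse into ('lit', byte) and ('match', offset, length) tokens.

    Hypothetical helper: candidates come from a table keyed by 3-byte
    prefixes, so only positions that already agree on 3 bytes are compared.
    """
    table = defaultdict(list)  # 3-byte prefix -> earlier positions
    tokens = []
    i, n = 0, len(data)
    while i < n:
        best_len, best_off = 0, 0
        key = data[i:i + MIN_MATCH]
        if len(key) == MIN_MATCH:
            # Try the most recent candidates first, up to max_chain of them.
            for j in reversed(table.get(key, [])[-max_chain:]):
                if i - j > window:
                    break  # older candidates are even further away
                length = MIN_MATCH
                while (length < max_len and i + length < n
                       and data[j + length] == data[i + length]):
                    length += 1
                if length > best_len:
                    best_len, best_off = length, i - j
        if best_len >= MIN_MATCH:
            tokens.append(('match', best_off, best_len))
            step = best_len
        else:
            tokens.append(('lit', data[i]))
            step = 1
        # Index every position we just consumed.
        for p in range(i, i + step):
            if p + MIN_MATCH <= n:
                table[data[p:p + MIN_MATCH]].append(p)
        i += step
    return tokens


def expand(tokens):
    """Inverse of find_matches: rebuild the bytes from the token list."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == 'lit':
            out.append(tok[1])
        else:
            off, length = tok[1], tok[2]
            for _ in range(length):
                out.append(out[-off])
    return bytes(out)
```

Compared with scanning the whole window at every position, the table lookup visits only positions that share the next three bytes, and max_chain caps the work when one prefix occurs very often. That trades a little compression ratio for speed.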
lz.adb
Ada source for compression based on the LZH package.
http://www.mysunrise.ch/users/gdm/lz__adb.htm
More PKWare DCL Decompression Code
C++ code posted to comp.compression that describes extraction from PKWare’s Data Compression Library.
Update: The author posted this correction to comp.compression:
There’s a bug in the code posted 2001-10-07 19:36:38 PST. To fix:
In the
void tcDecoder::Decode(char *apBuffer, unsigned int *apSize, unsigned int anBufferSize)
function after both
if (lnIndex == mnCurrentPos) lnIndex = lnStartIndex;
add
if (lnIndex == mnDictionarySize) lnIndex = 0;
Dictionary Selection using Partial Matching
D. T. Hoang, P. M. Long and J. S. Vitter. “Dictionary Selection using Partial Matching,” Information Sciences, 119(1-2), 57-72, 1999. This paper describes an attempt to squeeze improved compression out of existing dictionary-based schemes by using multiple context-based dictionaries for encoding.
http://www.cs.duke.edu/~jsv/Papers/catalog/node73.html
Tlzrw1: Delphi compression component with LZH and LZRW1/KH
The LZH and LZRW1/KH routines are from the SWAG Pascal code archive.
http://www.programmersheaven.com/zone2/cat56/14494.htm
TLZHCompressor a compression component for Delphi
This unit implements a component which allows the user to compress data using a combination of LZSS compression and adaptive Huffman coding (similar to that used by LHARC 1.x), or conversely to decompress data that was previously compressed by this unit.
http://www.programmersheaven.com/zone2/cat56/6044.htm
DCompress v1.00 library
For Delphi and other Windows compilers. A DLL library of compression and decompression routines. Mostly assembler, with fast decompression!
http://www.programmersheaven.com/zone15/cat158/16272.htm
LZSSLib - a Windows compression DLL for Windows programmers.
LZSSLib is a compression library (DLL) for Windows programmers. You have access to compression/decompression functions permitting file-to-file operations. LZSSLib uses the LZSS algorithm with various modifications, each providing different enhancements. Very simple to use: LZSSPackFile('PROG.EXE', 'PROG.LZS'). Works with any language that supports DLL calling, such as Turbo Pascal, C/C++, Actor, Visual Basic, Realizer, even ObjectVision.
http://www.programmersheaven.com/zone15/cat158/6779.htm
PKWare Data Compression Library Format
In this comp.compression posting, Ben Rudiak-Gould opens up the compression format used by the PKWare Data Compression Library. This is the only place I have ever seen this information disclosed; PKWare has certainly not done so.
The LZSS Algorithm
The Data Compression Center gives an explanation of LZSS coding.
This link points to an archived site, as the original has disappeared. Links on the archived page may or may not work properly.
LZ77 Daten - Dekompression auf dem 68HC11 Micro Controller
Christian Scheurer wrote up his LZ77 project that was targeted to the 68HC11 processor. It’s written in German; perhaps you could use Babel Fish to translate it. Source code included.
http://www.mountpoint.ch/unique/project/t-rip/
Collake Software - JCALG1
Home of JCALG1, an LZSS-derived lossless compression algorithm with full 32-bit x86 assembly source. Data Compression Library user comment: I found LZSS C source and an EXE. The EXE was useful for testing. I expect to use this in an embedded app after further research.
http://www.collakesoftware.com/
An Optimizing Hybrid LZ77 RLE Data Compression Program
An Optimizing Hybrid LZ77 RLE Data Compression Program, aka Improving Compression Ratio for Low-Resource Decompression by Pasi Ojala.
Presents a new literal tagging system, a fast exhaustive string match algorithm, an optimal parsing algorithm, and results on Calgary Corpus and Canterbury Corpus.
http://www.cs.tut.fi/~albert/Dev/pucrunch/