data compression link collection

Arithmetic Coding

Arithmetic coding is a method of encoding data using a variable number of bits. The number of bits used to encode each symbol varies according to the probability assigned to that symbol. Low probability symbols use many bits, high probability symbols use fewer bits. So far, this makes Arithmetic Coding sound very similar to Huffman coding. However, there is an important difference. An arithmetic encoder doesn’t have to use an integral number of bits to encode a symbol. If the optimal number of bits for a symbol is 2.4, a Huffman coder will probably use 2 bits per symbol, whereas the arithmetic encoder my use very close to 2.4. This means an arithmetic coder can usually encode a message using fewer bits. The method by which this is accomplished is somewhat complex, and is explained in some of the links shown below

Arithmetic Coding revealed - A guided tour from theory to praxis

An updated and translated version of our German paper “Proseminar Datenkompression - Arithmetische Kodierung” from 2001. To the best of our knowledge, it is the first comprehensive paper that describes the whole way from the basic principles of AC up to a simple implementation, fully documented with C++ source code.

* * * * *

Posted in November 6th, 2004

Michael Dipperstein’s Airthmetic Coding Page

The third in Michael’s collection of pages explaining lossless compression algorithms. A nice tutorial accompanied by ANSI C source.

* * * * *

Posted in May 2nd, 2004

Five Cents on Arithmetic Encoding

Aliaksei Sanko makes a few improvements to the code in the original 1987 CACM article. His sample includes a templated producer and consumer.


Posted in May 1st, 2004

Compression and Encryption Sources

Links to a variety of lossless coders, includes source for Huffman, arithmetic, LZSS, and other compressors.

* * * * *

Posted in May 1st, 2004

PAQ4 Archiver

The latest in the series of multi-model compressors from Matt Mahoney. This improves on PAQ3n’s remarkable Calgary corpus performance by an additional 12K, at some expense in speed. Takes a whopping 84MB at runtime!


Posted in October 26th, 2003


Matt Mahoney says that with recent improvements by Serge Osnach, PAQ3N does better on the Calgary Corpus than any other open source compressor.

* * * * *

Posted in October 18th, 2003

Arithmetic Coding

Published in Links, Arithmetic Coding

The NIST page on arithmetic coding from their Dictionary of Algorithms and Data Structures. A terse definition and a couple of links.


Posted in March 21st, 2003

Arithmetic Coding Revisited: A New Floating Point Technique

Anand Jain discusses his new technique for arithmetic coding. An executable file is included, but no source code.


Posted in February 22nd, 2003

Compression via Arithmetic Coding in Java

Bob Carpenter has created a nice Java package that implements a PPM/arithmetic coding compression system. This page includes links to the source code, javadocs, and a fair amount of tutorial material. Very complete!

* * * * *

Posted in December 11th, 2002

Effective Arithmetic Coding

Sachin Garg’s article and source on arithmetic coding, includes a 64-bit implementation.

* * *    

Posted in September 3rd, 2002

Arithmetic Coding Revisited

by Moffat, Neal, and Witten. The authors of the original CACM article on arithmetic coding take a fresh look at the topic with an additional ten years of knowledge.

* * * * *

Posted in August 25th, 2002

Parallel Lossless Image Compression Using Huffman and Arithmetic Coding

by P. G. Howard and J. S. Vitter. This paper shows how images can be encoded and decoded using parallel processing. Both Huffman and arithmetic coding are examined.

* * * * *

Posted in July 28th, 2002

Lossless Compression Algorithms (Entropy Encoding)

An overview of the basics, including Shannon-Fano, Huffman, Arithmetic coding, and a section on LZW for good measure.

* * *    

Posted in July 20th, 2002

Pegasus ELS Sample Code

Pegasus has a patented range coder that they license at no charge. This archive contains some C code that provides a sample implementation.


Posted in July 12th, 2002

David Scott’s bijective 2nd order English compressor

This is compressor has static probabilities designed for English language text, encoding ASCII text and spaces only.


Posted in July 2nd, 2002

David Scott’s Bijective Static Entropy Compression

This compressor is designed to operate on English text - it has a static probability table configured for written English.

* * * * *

Posted in June 26th, 2002

Proseminar Datenkompression - Arithmetische Kodierung

This page gives an introduction to Arithmetic Coding and shows how to implement it using floats or integers. There is also a proof of the efficiency of the algorithms, along with visualization and Win32 binaries. This page is in English and includes links to material in both German and English. user Juergen Abel found the site quite good: Clear description, especially the explanation of the renormalisation part, full source code.

* * * *  

Posted in April 15th, 2002

Arithmetic Coding for Data Compression

by P. G. Howard and J. S. Vitter. “Arithmetic Coding for Data Compression,” Proceedings of the IEEE, 82(6), June 1994, 857-865. This paper describes arithmetic coding, and introduces a technique that uses table lookups to make the process more efficient.


Posted in April 8th, 2002

Practical Implementations of Arithmetic Coding

This paper by Paul Howard and Jeff Vitter goes over some of the basics of Arithmetic Coding, then outlines a coder that has increased efficiency by virtue of substituting table lookups for expensive arithmetic operations.


Posted in April 7th, 2002

Arithmetic Coder Visualisation

The author of this code created this visualization executable for a seminar.

DCL reader Drew D. downloaded the code, executed it, and had this to say about it: The program is an executable for windows with a dll and some gif’s. The program seems to require a screen size greater than 800×600 to view the fixed size window. The program is a bit cryptic to me since I only have a basic understanding of Arithmetic encoding but it offers nice step by step encoding with some statistical information. It also allows for the modifying of input types and reading from a file..


Posted in April 1st, 2002

Wikipedia Entry: Arithmetic Coding

The Wikipedia entry for Arithmetic coding. Too short for anything other than a thumbnail sketch.


Posted in January 27th, 2002

Analysis of Arithmetic Coding for Data Compression

A paper by Howard and Vitter, available in both PDF and PS formats. The paper provides an analysis of the effect that models and various implementations of arithmetic coding have on compression.


Posted in December 30th, 2001

US5422734: Method for arithmetically encoding half-tone image in image processing system

An arithmetic coding patent which appears to be currently assigned to Samsung.


Posted in December 24th, 2001

A Two-Stage Modelling Method for Compressing Binary Images by Arithmetic Coding

A paper describing a compression scheme for bitonal images. Complete text of paper in compressed postscript.


Posted in December 8th, 2001

Standard ECMA-159 Data Compression for Information Interchange - Binary Arithmetic Coding Algorithm

The ECMA has a standard defintion for Arithmetic Coding.


Posted in December 5th, 2001

An Improved Data Structure for Cumulative Probability Tables

A paper by Alistair Moffat describing an improvement on Peter Fenwick’s method for maintaining cumulative probability tables. The pointer on this page leads to some source implementing said table.


Posted in November 12th, 2001

Carryless Rangecoder

Mikael Lundqvist has adapted Dmitry Subbotin’s C++ rangecoder to C. Michael says it is simple and fast.
One commented that some more documentation would be helpful.

* * *    

Posted in September 27th, 2001


A bijective arithmetic compressor written in Java. Includes all source code, seems to be free.

* * * *  

Posted in August 11th, 2001

David A. Scott’s Fully Bijective Arithmetic Coder

The term bijective as used by Scott means that for any given given file X you are guarantted that A( B( X ) ) == B( A( X ) ) == X, where A and B are a pair of bijectively matched programs. In this particular case, A and B are a compressor and decompressor that use arithmetic coding. Includes C++ source.

* * *    

Posted in May 1st, 2001

JPEG arithmetic encoding and decoding portable software implementation

A back-end implemenation of arithmetic coding for JPEG as defined in the standard. It is distributed as an add-on that can be used with the Independent JPEG groups library. The work of Guido Vollbeding.

* * * *  

Posted in March 9th, 2001