Benchmarks are an important part of the data compression world. Performance against benchmarks is a good way to judge algorithms in a fair manner. The problem, of course, is selecting benchmarks that accurately model the needs of the eventual users of the algorithm.
Maximum Compression
Werner Bergmans has created a new benchmark site that aims to show the best compression ratios possible for multiple file types, including English text, executables, graphics, and so on. Werner says he is running these tests with 80-100 programs for each file type!
Reader Werner B. says “Useful site to compare results of different compression programs. Regularly updated.”
http://www.maximumcompression.com/
Berto’s Compression Spreadsheet
Comparisons of over 230 archivers, in handy Excel format, from Berto.
Reader Emiliano C. said “Wonderful! Great! Wonderful! Cool!”
http://cs.fit.edu/~mmahoney/compression/ct.xls
Silesia compression corpus
Sebastian Deorowicz decided to create a compression corpus of his own, attempting to overcome some of the deficiencies he sees in the old guard.
http://sun.iinf.polsl.gliwice.pl/~sdeor/corpus.htm
Benchmark Images and Files
David Cary is a major link farmer. One of the sections of his massive Data Compression page has links to various images and files that are used in various benchmarks.
http://agora.rdrop.com/~cary/html/data_compression.html#benchmark
Archive Comparison Test 2.0
ACT - by Jeff Gilchrist. ACT is the Archive Comparison Test, a long-running benchmark of well-known archiving programs. Lots of good updates in May of 2002.
Jeff Gilchrist
This is Jeff Gilchrist’s home page. Jeff is the curator of the Archive Comparison Test, which presumably keeps him busy.
Neural Network Text Compression Programs and Papers
A couple of programs using neural networks for compression, along with a couple of papers by the author. This area of data compression is definitely underserved; check out what’s here and see if neural networks deserve more attention than they are getting.
Update: This page appears to now have some links to general lossless benchmarking info.
http://cs.fit.edu/~mmahoney/compression/
Waterloo BragZone test suite
In the BragZone you will find the following:
ftp://links.uwaterloo.ca/pub/BragZone
Waterloo BragZone
Comparing different image compression programs has always been difficult. As a suite of test images and a place for archiving results, the Waterloo BragZone hopes to overcome these problems. Central to the effort is the Waterloo Repertoire, a suite of 32 test images.
http://links.uwaterloo.ca/bragzone.base.html
PNG Suite from Willem van Schaik
This is Willem van Schaik’s suite of PNG icons for testing PNG decoder engines, PNG viewers, and PNG browsers.
http://www.libpng.org/pub/png/pngsuite.html
Where can I find Lenna and other images?
The comp.compression FAQ attempts to answer this for you.
http://www.faqs.org/faqs/compression-faq/part1/section-30.html
yabbawhap - Y and AP compression filters
Public domain code by Daniel Bernstein. (Note that this ftp site has an excellent selection of compression programs and code.)
ftp://ftp.inria.fr/system/arch-compr/yabba.tar.Z
The Canterbury Corpus
This is the home page for the Canterbury Corpus, a test suite designed to provide a standard set of files for lossless compression testing. You will find links to the actual files in the test suite, as well as papers and test results.
http://corpus.canterbury.ac.nz/
The British National Corpus
The British National Corpus (BNC) is a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent a wide cross-section of current British English, both spoken and written.
The Calgary Corpus
This is the home page for the Calgary Corpus. This set of files has long been the standard used for comparison of various lossless compression techniques.
http://links.uwaterloo.ca/calgary.corpus.html
Compression Ratios
A set of benchmarks for lossless compression of various test sets, including the CCITT B&W images, the Calgary Corpus, and a Gray Scale set. Includes some dates for checking historical progression.
http://www.cs.waikato.ac.nz/~singlis/ratios.html
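For readers who want to produce comparable numbers on their own machine, here is a minimal sketch of a ratio benchmark. It is not the method used by any of the sites listed here. It assumes the corpus files have already been downloaded into a local directory, and it uses Python's standard-library compressors (zlib, bz2, lzma) as stand-ins for the programs under test. The function names `bits_per_byte`, `benchmark`, and `benchmark_directory` are made up for this illustration. Results are reported in compressed bits per input byte, a unit commonly seen in corpus results.

```python
import bz2
import lzma
import zlib
from pathlib import Path

# Standard-library compressors, used here only as stand-ins for
# whatever programs a real benchmark would compare.
COMPRESSORS = {
    "zlib": lambda data: zlib.compress(data, 9),
    "bz2": lambda data: bz2.compress(data, 9),
    "lzma": lambda data: lzma.compress(data),
}


def bits_per_byte(original_size, compressed_size):
    """Compressed bits emitted per input byte (lower is better)."""
    return 8.0 * compressed_size / original_size


def benchmark(data):
    """Return {compressor name: bits per byte} for one block of input."""
    return {
        name: bits_per_byte(len(data), len(compress(data)))
        for name, compress in COMPRESSORS.items()
    }


def benchmark_directory(corpus_dir):
    """Benchmark every non-empty regular file in corpus_dir.

    corpus_dir is a hypothetical local path holding the test files.
    Returns {file name: {compressor name: bits per byte}}.
    """
    results = {}
    for path in sorted(Path(corpus_dir).iterdir()):
        if path.is_file() and path.stat().st_size > 0:
            results[path.name] = benchmark(path.read_bytes())
    return results
```

Calling `benchmark_directory("calgary")`, where `calgary` is whatever local directory holds the files, returns one row of figures per file. Measuring only size keeps the sketch short; a fuller benchmark would also record compression and decompression time.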
The Calgary Corpus Compression Challenge
Leonid A. Broukhis puts his money where his mouth is by offering a cash prize for good, reproducible compression. He has paid out at least one modest prize.
http://www.mailcom.com/challenge
Calgary Corpus test results
A set of test results for files run against the Calgary Corpus. This set of test results is kept on the Canterbury web site so that it can be easily referenced for comparison purposes.
http://corpus.canterbury.ac.nz/details/
An FTP site for the Calgary Corpus
The Calgary Corpus is a set of files that were put together by compression mavens Bell, Cleary, and Witten in 1989 for benchmarking lossless compression algorithms. The set includes English text, source code, executable code, and some data files.
ftp://ftp.cpsc.ucalgary.ca/pub/projects/text.compression.corpus/