## Info

Lecturer: Boris Kudryashov

Language: English

Lecture: Monday, 10:15 -- 12:00, Delta, 2047

Practice session: Wednesday, 10:15 -- 12:00, Delta, 2045

Books for the course:

- M. Nelson and J.-L. Gailly, "The data compression book 2nd edition," M & T Books, New York, NY, (1995).
- I. Bocharova, "Compression for multimedia," Cambridge Univ. Press (is sold via INTERNET)

- D. Salomon, and G. Motta, "Handbook of data compression," Springer Science & Business Media, 2010.

Contact:

Boris Kudryashov (boriskud@ut.ee)

Vitaly Skachek

## General information

This course is primarily intended for Bachelor's level students and is recommended for Master's students with an interest in data compression techniques.

First, we will answer the questions

- why for a given data the required storage space can be reduced?
- how much space is necessary for the source with known probabilistic model?

Information Theory gives precise answers to these questions. Optimum codes for sources with known statistical models are known from the dawn of information theory. Surprisingly, the near-optimum coding efficiency can be achieved also in the case if there is no a priori information about source statistics. We will study elegant coding algorithms combining high efficiency and low complexity.

Nowadays most of the data transmitted over communication networks is analog data represented in digital formats. The quality loss is inevitable consequence of the analog-to-digital conversion.

The required data rate vs quality trade-off is established by theorems of the rate-distortion theory. However, unlike the lossless coding theory, there is no universal lossy coding which is equally efficient for all types of multimedia data. Contemporary lossy coding techniques rely on the physical models of data sources and human perceptual sensitivity to different distortions. Standard coding techniques are based on complicated mathematical models and computational algorithms. Nevertheless, we will reveal most of the secrets of high efficiency of ACELP coding for speech, AAC and MP3 coding for music, JPEG and MPEG coding for still images and video.

# Contents

**Lectures and practice sessions?**

N | Lecture | Practice session |
---|---|---|

1 | Measuring the quantity of information. Basic notions of information theory | Computing achievable data rates for examples of discrete sources |

2 | Uniform and nonuniform source codes | Applications of Huffman codes to different scenarios |

3 | Arithmetic coding | Practical implementation |

4 | Off-line universal coding | Examples. Implementation issues |

5 | On-line universal coding for memoryless sources | Examples. Implementation issues |

6 | Ideas behind low-complexity universal coding techniques. Interval coding and recency rank coding. LZ-77 algorithm | Monotonic codes. LZ-77 example |

7 | LZW algorithm, PPM algorithm | Examples. Implementation issues |

8 | Burrows-Wheeler transform and its application. Overview of the best archivers | A BWT-based data compression algorithm |

9 | Fundamentals of rate-distortion theory. Measuring of analog information. Sampling. Transforms. Quantization | Scalar quantization. Rate-distortion functions for Gaussian variables and processes |

10 | Optimum scalar quantization. Vector quantization | Simulations of quantizers followed by lossless encoders |

11 | Linear prediction based speech coding. Demos | Matlab implementation of main stages of speech coder |

12 | Psycho-acoustic models and filterbanks. Standards. Demos | Implementation of subband decomposition via overlapped harmonic transforms |

13 | Formats of image data. JPEG | Implementation issues |

14 | Discrete wavelet trandsform. JPEG-2000 | Implementation example |

15 | Formats of video data. Motion compensation. MPEG standards | Simulations. Demos. |

16 | Horizons. Open problems. Theoretical research topics | Summary of projects |

Topic | Content | Start (week) | Deadline (week) |

Measuring information | For a real discrete data source estimate theoretically achievable compression ratio. Compare with the efficiency of existing archivers | 1 | 6 |

Universal data compression | Implement programs for estimating the efficiency of a given set of universal data compression algorithms. Choose the best for a given data source. Compare with efficiency of the standard archivers | 5 | 13 |

Analog data compression | Choose a multimedia data fragment. Evaluate rate-distortion trade-off using straightforward scalar quantization followed by lossless encoding. Repeat computations using a proper transform | 10 | 17 |

### Schedule:

Lecture Monday 10.15-12.00 weeks (24-39), Delta, 2047

Practical session Wed. 10.15 - 12.00 week (24-39) Delta, 2045

# Demos

## Speech compression standards

The original speech file (PCM 8kHz, 16 bits, size 73684 bytes

ADPCM, 8kHz, 4 bits, size 18688 bytes

GSM Standard, CELP-like coder, size 7540 bytes

CELP-like coder, size 4896 bytes

## Image compression standards

Original image file, size 3145782 bytes

JPEG Standard, size 31016 bytes

JPEG2000, size 31016 bytes

Wavelet coder, size 29401 bytes