# Introduction to Information Theory

**Program:** Bachelor Mathematics, Informatics, AI, Physics & Astronomy (WI, WInf, IN, KI, NS)

**Lecturer:** Michael Walter

**Teaching assistants:** Freek Witteveen

**Schedule:** Sept-Oct 2019

**Further information:** DataNose,
Canvas

This is the official course homepage. All material will be provided here.

## Course description

Reading your email on the metro may not strike you as particularly memorable. Isn’t it remarkable, though, that – despite relying on a noisy cellphone signal underground – none of your messages arrive corrupted? In this course you will learn the mathematics of how this is achieved.

This course will give an introduction to *information theory* – the mathematical theory of information. Ever since its inception in the 50s, information theory has had a profound impact on society. It underpins important technological developments, from reliable memories to mobile phone standards, and its versatile mathematical toolbox has found use in computer science, machine learning, physics, and even pure mathematics.

Starting from probability theory, we will discuss how to mathematically model *information sources* and *communication channels*, how to optimally *compress* and *transmit* information, and how to design *error-correcting codes* that allow us to reliably communicate over noisy communication channels. We will also see how techniques used in information theory can be applied more generally to *make predictions* from noisy data.

Students interested in writing a thesis in quantum information theory are also encouraged to follow Maris Ozol’s course Introduction to Quantum Computing.

## Prerequisites

We will use rigorous definitions and proofs, hence some “mathematical maturity” will be assumed. Familiarity with *basic probability theory* will be expected (but we will remind you of the most important facts). In addition, some of the homework problems will require *programming*. You can use the programming language of your choice; examples and solution will be given in Python.

## Lecture notes

- Lecture 1 (Sept 3, 2019): Welcome, Introduction to Information Theory [✍] [▶]
- Lecture 2 (Sept 5, 2019): Probability Theory Refresher [✍] [▶]
- Lecture 3 (Sept 10, 2019): Entropy and the Source Coding Theorem [✍] [▶]
- Lecture 4 (Sept 12, 2019): The Source Coding Theorem: Proof and Variations [✍] [▶]
- Lecture 5 (Sept 17, 2019): Symbol Codes: Huffman Algorithm [✍] [▶]
- Lecture 6 (Sept 19, 2019): Stream Codes: Lempel-Ziv Algorithm [✍] [▶]
- Lecture 7 (Sept 24, 2019): Stream Codes: Arithmetic Coding [✍] [▶]
- Lecture 8 (Sept 26, 2019): Joint Entropies & Communication over Noisy Channels [✍] [▶]
- Lecture 9 (Oct 1, 2019): The Noisy Coding Theorem [✍] [▶]
- Lecture 10 (Oct 3, 2019): Proof of the Noisy Coding Theorem [✍] [▶]
- Lecture 11 (Oct 8, 2019): Proof of the Converse, Theory vs Practice, Reed-Solomon Codes [✍] [▶]
- Lecture 12 (Oct 10, 2019): Decoding Reed-Solomon Codes, Message Passing Algorithms [✍] [▶]
- Lecture 13 (Oct 15, 2019): Linear Codes, LDPC Codes, Decoding by Message Passing [✍] [▶]
- Lecture 14 (Oct 17, 2019): Message Passing for Decoding and Inference, Outlook [✍] [▶]

A playlist of all lecture recordings can be founded here. 🎞

Main reference is David J. C. MacKay’s textbook *“Information Theory, Inference, and Learning Algorithms”*, Cambridge (2003), which is available online. 📖

## Homework

- Problem set 1 [pdf] [colab]
- Problem set 2 [pdf] [colab]
- Problem set 3 [pdf] [colab]
- Problem set 4 [pdf] [colab]
- Problem set 5 [pdf] [colab]
- Problem set 6 [pdf] [colab]

## Exercises classes

All exercises refer to MacKay’s textbook:

- Exercise class 1 (Sept 4, 2019): 1.3, 1.5, 1.6, 1.7, 1.10, optimality of the repetition code decoder
- Exercise class 2 (Sept 6, 2019): 2.2, 2.16, 2.14, 2.23, 2.25, 3.8, 3.9, 3.12, 4.1; 13.24 (just for fun)
- Exercise class 3 (Sept 11, 2019): [pdf] [ipynb]
- Exercise class 4 (Sept 13, 2019): [pdf]
- Exercise class 5 (Sept 18, 2019): 5.8, 5.21 (only for X^2, X^3), 5.25, 5.31, [pdf]
- Exercise class 6 (Sept 20, 2019): 6.5, 6.6, [pdf]
- Exercise class 7 (Sept 25, 2019): 6.7, [pdf]
- Exercise class 8 (Sept 27, 2019): 8.4, 9.7, 9.8, 9.12, 9.13, [pdf]
- Exercise class 9 (Oct 2, 2019): 9.12, 9.13, 9.20, 9.21, [pdf]
- Exercise class 10 (Oct 4, 2019): [pdf]
- Exercise class 11 (Oct 9, 2019): [pdf]
- Exercise class 12 (Oct 11, 2019): 16.1, [pdf]
- Exercise class 13 (Oct 16, 2019): 26.8, 26.2, 26.3, 47.3
- Exercise class 14 (Oct 18, 2019): 15.1, 15.2, 15.3, 15.8, 15.9, 15.10, 15.11, 15.13, [pdf] – last practice before the exam!

## Learning Objectives

At the end of the course, you will be able to:

- mathematically model memoryless and Markov information sources using probability theory
- mathematically model memoryless information channels using probability theory
- compute basic properties of probability distributions, such as entropies and essential bit contents, using mathematical techniques such as Jensen’s inequality
- define typical sets and estimate their size and probability
- prove and apply Shannon’s source coding theorem
- prove and apply the Kraft inequality for uniquely decodable (UD) codes, its converse for prefix codes, and use it to bound the average length for optimal UD or prefix codes
- design optimal symbol codes using Huffman’s coding algorithm
- analyze and apply the Lempel-Ziv algorithm for stream coding
- analyze and apply arithmetic coding for different probabilistic models
- compute basic properties of joint probability distributions, such as conditional entropies and mutual informations
- prove and apply Shannon’s noisy channel coding theorem
- understand basic properties of block codes, such as their rate and various probabilities of error
- encode Reed-Solomon codes and decode them for erasure errors
- understand maximum likelihood approach to codeword and bitwise decoding
- design and apply message passing algorithms for encoding and inference problems

## Format

Lectures, exercises classes, self-study.

## Assessment

The final grade will be determined by the *maximum* of the following two options:

- 60% exam grade + 40% homework grade,
- 100% exam grade.

In addition, your exam grade alone has to be a passing grade. The same rule applies for the re-sit exam.

There will be one homework problem set per week (in total 6), posted on the course homepage by Friday. You must submit your completed homework before the exercise class on Friday the week after – either directly in the exercise class or by email to Freek Witteveen. The solutions will be discussed in the Friday exercise class (among other things).

If you have extenuating circumstances (such as sickness or family emergency) you must confirm with the lecturer before the deadline.

Everything that we discussed in class and on the homework is in principle in scope for the exam. We recommend that you revisit the homework problems and the exercises as a preparation. You are allowed to bring one self-prepared “cheat sheet” to the exam (A4 paper, hand-written, you can use both sides).