9871: Information Theory and Coding (May 3, 2011)


  • INFORMATION THEORY and CODING

    Founded by Shannon in 1948.

    This theory concerns the transmission of information over a channel, or its recording in a storage medium.

    The channel can be a wireless or wired channel (copper telephone or fiber-optic cables), or a storage medium such as magnetic or optical disks.

    There are three aspects:

    Efficiency (Compression)

    Reliability (Error Detection and Correction)

    Security (Cryptography)

    Information Theory is based on Probability Theory.

    Coding Theory makes use of finite fields, linear algebra, combinatorial design, etc.

  • Shannon's Information Theory and Coding

    [Block diagram: the source emits sent messages; the encoder performs source coding (compression, characterized by the source entropy) and channel coding (error detection and correction, characterized by the channel capacity); the resulting symbols (e.g., 0110101001110) pass through the channel; the decoder performs channel decoding and source decoding (decompression) to deliver the received messages to the receiver.]

  • Overview of Digital Communication and Storage Systems

    [Block diagram: Information Source -> m -> Encoder -> c -> Channel -> r -> Decoder -> m̂ -> Information Sink]

    GOAL: Information conveyed through (or stored in) the channel must be reproduced at the destination as reliably as possible. At the same time, the system must allow the transmission of as much information as possible per unit time (communication system) or per unit of storage (storage system).

    Channel: produces a received signal r which differs from the original signal c (the channel introduces noise, distortion, etc.).

    The decoder can only produce an estimate m̂ of the original message m.
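
    To make the role of the channel concrete, here is a minimal Python sketch (assuming a binary symmetric channel, a model the slides do not specify) in which each transmitted bit of c is flipped independently with a small crossover probability, producing the received signal r.

        import random

        def binary_symmetric_channel(c, flip_prob=0.05, seed=None):
            """Flip each bit of the codeword c independently with probability flip_prob."""
            rng = random.Random(seed)
            return [bit ^ (rng.random() < flip_prob) for bit in c]

        # The decoder only sees r, which may differ from c in a few positions.
        c = [0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0]
        r = binary_symmetric_channel(c, flip_prob=0.1, seed=42)
        print("sent:    ", c)
        print("received:", r)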

  • Information Source

    The source message m consists of a time sequence of symbols emitted by the information source.

    The source can be:

    Continuous-time Source, if the message is continuous in time, e.g., a speech waveform.

    Discrete-time Source, if the message is discrete in time, e.g., data sequences from a computer.

    The symbols emitted by the source can be:

    Continuous-amplitude, e.g., speech waveform.

    Discrete-amplitude, e.g., text with a finite symbol alphabet.

    This course is primarily concerned with discrete-time and discrete-amplitude sources, as practically all new communication or storage systems fall into this category.

  • DISCRETE SOURCES AND ENTROPY

    Discrete Information Sources and Entropy

    Source Alphabets and Entropy

    Joint and Conditional Entropy

    Entropy of Symbol Blocks and the Chain Rule

    Source Coding

    Huffman Coding

    Dictionary Codes and Lempel-Ziv Coding

    Arithmetic Coding

  • Source Alphabet and Entropy

    Information Theory is based on Probability Theory, as the term information carries with it a connotation of UNPREDICTABILITY (SURPRISE) in the transmitted signal.

    The Information Source is defined by:

    - The set of output symbols

    - The probability rules which govern the emission of these symbols.

    Finite-Discrete Source: finite number of unique symbols.

    The symbol set is called the Source Alphabet.

  • Source Alphabet and Entropy (cont'd)

    Definitions:

    A = {a_0, a_1, ..., a_(M-1)} is an alphabet with M possible symbols. We can say that the emitted symbol is a random variable which takes values in A.

    The number of elements in a set is called its cardinality, e.g., M = |A|.

    The source output symbols are a = (s_0, s_1, ..., s_t, ...), where s_t is the symbol emitted by the source at time t. Note that here t is an integer time index.

    Stationary Source: the set of probabilities is not a function of time.

    At any given time moment, the probability that the source emits a_m is p_m ≡ Pr(a_m).

    Probability mass function: P_A = {p_0, p_1, ..., p_(M-1)}.

    Since the source emits only members of its alphabet, the probabilities sum to one: p_0 + p_1 + ... + p_(M-1) = 1.
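
    A minimal Python sketch of these definitions (the 4-symbol alphabet and its probabilities below are illustrative, not taken from the slides): it represents a stationary finite-discrete source by its alphabet A and probability mass function P_A, checks that the probabilities sum to one, and emits an output sequence s_0, s_1, ....

        import random

        class DiscreteSource:
            """A stationary finite-discrete source defined by its alphabet and pmf."""
            def __init__(self, alphabet, probabilities, seed=None):
                assert len(alphabet) == len(probabilities)
                # The source emits only members of its alphabet, so the pmf sums to 1.
                assert abs(sum(probabilities) - 1.0) < 1e-9
                self.alphabet = list(alphabet)   # A = {a_0, ..., a_(M-1)}
                self.pmf = list(probabilities)   # P_A = {p_0, ..., p_(M-1)}
                self.M = len(self.alphabet)      # cardinality M = |A|
                self._rng = random.Random(seed)

            def emit(self, n):
                """Return the output sequence (s_0, s_1, ..., s_(n-1))."""
                return self._rng.choices(self.alphabet, weights=self.pmf, k=n)

        # Hypothetical 4-symbol source used only for illustration.
        src = DiscreteSource("abcd", [0.5, 0.25, 0.125, 0.125], seed=0)
        print(src.M, src.emit(10))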

  • Source Alphabet and Entropy (cont'd)

    Information Sources: Classification

    Stationary Versus Non-Stationary Source:

    For a stationary source the set of probabilities is not a function of time, whereas for a non-stationary source it is.

    Synchronous Source Versus Asynchronous Source:

    A synchronous source emits a new symbol at a fixed time interval Ts, whereas for an asynchronous source the interval between emitted symbols is not fixed.

    The latter can be approximated as synchronous by defining a null character: when the source does not emit anything at time t, we say it emits the null character at time t.
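
    The short sketch below illustrates this null-character trick (the time slots and symbols are made up for the example): an asynchronous source described by (time, symbol) pairs is turned into a synchronous sequence by emitting a null character in every time slot where nothing was sent.

        NULL = "-"  # illustrative null character

        def synchronize(emissions, n_slots, null=NULL):
            """Map {time index: symbol} onto a fixed-rate sequence, filling gaps with null."""
            return [emissions.get(t, null) for t in range(n_slots)]

        # Asynchronous emissions at the irregular times 0, 3, 4 and 7.
        emissions = {0: "a", 3: "b", 4: "a", 7: "c"}
        print(synchronize(emissions, 9))  # ['a', '-', '-', 'b', 'a', '-', '-', 'c', '-']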

  • Source Alphabet and Entropy (cont'd)

    Representation of the Source Symbols

    - In digital systems, the binary representation is used.

    Distinction between Data and Information

    Example: An information source has an alphabet with only 1 symbol. This symbol is data, but this data is not information, as it is completely uninformative. Since information carries the connotation of uncertainty, the information content of this source is zero.

    HOW CAN ONE MEASURE THE INFORMATION CONTENT OF A SOURCE?

    The symbols are referred to as Source Data.

    Pop Quiz: How many bits are needed to represent the symbols 1, 2, 3, 4? Or, more generally, an alphabet of n symbols 1, 2, 3, ..., n?

    Answer: 2 bits, and log2(n) bits, respectively, since 4 = 2^(log2 4) and n = 2^(log2 n).
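
    To make the quiz answer concrete, a small helper (assuming fixed-length binary codewords; the alphabet sizes below are just examples) gives the number of bits needed to label n distinct symbols, rounding log2(n) up when n is not a power of two.

        import math

        def bits_needed(n):
            """Fixed-length binary representation: ceil(log2 n) bits for n distinct symbols."""
            return max(1, math.ceil(math.log2(n)))

        print(bits_needed(4))   # 2 bits for the symbols 1, 2, 3, 4
        print(bits_needed(26))  # 5 bits for a 26-symbol alphabet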

  • Source Alphabet and Entropy (cont'd)

    Each transmitted symbol 1 is just one choice out of 1/p_1 many possible choices, and therefore symbol 1 contains log2(1/p_1) bits of information (since 1/p_1 = 2^(log2(1/p_1))).

    Similarly, symbol k contains log2(1/p_k) bits of information.

    The average number of information bits per symbol for our source is H(A) = p_1 log2(1/p_1) + ... + p_n log2(1/p_n).

    Example: Pick a marble from a bag of 2 blue and 5 red marbles. The probability of picking a red marble is p_red = 5/7, so the number of choices for each red marble picked is 1/p_red = 7/5 = 1.4.

    Shannon gave this precise mathematical definition of the average amount of information conveyed per source symbol, used to measure the information content of a source.
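
    The sketch below evaluates H(A) directly from this definition and checks it on the marble-bag example above (p_blue = 2/7, p_red = 5/7); the entropy function is a straightforward reading of the formula, not code from the slides.

        import math

        def entropy(pmf):
            """H(A) = sum over m of p_m * log2(1/p_m), in bits per symbol."""
            return sum(p * math.log2(1.0 / p) for p in pmf if p > 0)

        # Marble bag from the slide: 2 blue and 5 red marbles.
        print(round(entropy([2 / 7, 5 / 7]), 4))  # about 0.8631 bits per drawn marble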

  • Source Alphabet and Entropy (cont'd)

    Entropy of a Source: H(A) = p_0 log2(1/p_0) + ... + p_(M-1) log2(1/p_(M-1))

    Unit of Measure: bit

    M is the cardinality of the source A.

    Range: 0 <= H(A) <= log2(M), with the maximum reached when p_m = 1/M for m = 1, ..., M.

    Measurement of the Information Efficiency of the Source

    RATIO OF THE ENTROPY OF THE SOURCE TO THE (AVERAGE) NUMBER OF BINARY DIGITS USED TO REPRESENT THE SOURCE DATA.

    When the entropy of the source is lower than the (average) number of bits used to represent the source data, an efficient coding scheme can be used to encode the source information using, on average, fewer binary digits. This is called Data Compression, and the encoder used for that is called the Source Encoder.
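
    As a small numerical illustration of this ratio (the skewed 4-symbol source below is hypothetical), the sketch compares the source entropy to the 2 binary digits a fixed-length code would spend per symbol; a ratio below 1 indicates room for compression by a source encoder.

        import math

        def entropy(pmf):
            """Average information bits per symbol, H(A)."""
            return sum(p * math.log2(1.0 / p) for p in pmf if p > 0)

        def efficiency(pmf, avg_bits_per_symbol):
            """Information efficiency: H(A) divided by the (average) binary digits used."""
            return entropy(pmf) / avg_bits_per_symbol

        # Hypothetical skewed source; a fixed-length code spends log2(4) = 2 bits per symbol.
        pmf = [0.7, 0.15, 0.1, 0.05]
        print(round(entropy(pmf), 3))          # about 1.319 bits/symbol
        print(round(efficiency(pmf, 2.0), 3))  # about 0.66 -> compressible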