8/3/2019 Pres 3May 5May 9871
9871: Information Theory and Coding, May 3, 2011
INFORMATION THEORY and CODING
Founded by Shannon in 1948.
This theory concerns the transmission of information over a channel, or its recording in a storage medium.
The channel can be wireless or wired (copper telephone or fiber-optic cables), or a magnetic or optical disk.
There are three aspects:
Efficiency (Compression)
Reliability (Error Detection and Correction)
Security (Cryptography)
Information Theory is based on Probability Theory.
Coding Theory makes use of finite fields, linear algebra, combinatorial design, etc.
Shannon's Information Theory and Coding
[Block diagram: sent messages flow from the source through an encoder (source coding: compression, characterized by the source entropy; channel coding: error detection and correction, characterized by the channel capacity) into the channel as binary symbols (e.g., 0110101001110), then through a decoder (channel decoding, then source decoding: decompression) to the receiver as received messages.]
Overview of Digital Communication and Storage Systems
[Block diagram: Information Source → Encoder → Channel → Decoder → Information Sink; the source emits the message m, the encoder outputs the signal c, the channel delivers the received signal r, and the decoder produces the estimate m̂.]
GOAL: Information conveyed through (or stored in) the channel must be reproduced at the destination as reliably as possible. At the same time, the system needs to allow the transmission of as much information as possible per unit time (communication system) or per unit storage (storage system).
Channel: produces a received signal r which differs from the original signal c (the channel introduces noise, channel distortion, etc.).
The decoder can only produce an estimate m̂ of the original message m.
Information Source
The source message m consists of a time sequence of symbols emitted by
the information source.
The source can be:
Continuous-time Source, if the message is continuous in time, e.g., a speech waveform.
Discrete-time Source, if the message is discrete in time, e.g., data sequences from a computer.
The symbols emitted by the source can be:
Continuous-amplitude, e.g., speech waveform.
Discrete-amplitude, e.g., text with a finite symbol alphabet.
This course is primarily concerned with discrete-time, discrete-amplitude sources, as practically all new communication and storage systems fall into this category.
DISCRETE SOURCES AND ENTROPY
Discrete Information Sources and Entropy
Source Alphabets and Entropy
Joint and Conditional Entropy
Entropy of Symbol Blocks and the Chain Rule
Source Coding
Huffman Coding
Dictionary Codes and Lempel-Ziv Coding
Arithmetic Coding
Source Alphabet and Entropy
Information Theory is based on Probability Theory, as the term information carries with it a connotation of UNPREDICTABILITY (SURPRISE) in the transmitted signal.
The Information Source is defined by:
- The set of output symbols
- The probability rules which govern the emission of these symbols.
Finite-Discrete Source: finite number of unique symbols.
The symbol set is called the Source Alphabet.
Source Alphabet and Entropy (cont'd)
Definitions:
A is an alphabet with M possible symbols, A = {a_0, a_1, ..., a_{M-1}}.
We can say that the emitted symbol is a random variable, which takes values in A.
The number of elements in a set is called its cardinality, e.g., M = |A|.
The source output symbols are (s_0, s_1, ..., s_t, ...), where s_t is the symbol emitted by the source at time t. Note that here t is an integer time index.
Stationary Source: the set of probabilities is not a function of time.
At any given time moment, the probability that the source emits a_m is p_m = Pr(a_m).
Probability mass function: P_A = {p_0, p_1, ..., p_{M-1}}.
Since the source emits only members of its alphabet, Σ_{m=0}^{M-1} p_m = 1.
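These definitions can be sketched as a small simulation of a stationary, finite-discrete source. The 4-symbol alphabet and its probabilities below are illustrative assumptions, not values from the lecture:

```python
import random

# Hypothetical source alphabet A = {a_0, ..., a_{M-1}} with M = |A| = 4,
# and its probability mass function P_A = {p_0, ..., p_{M-1}}.
alphabet = ["a0", "a1", "a2", "a3"]
pmf      = [0.5, 0.25, 0.125, 0.125]

# The source emits only alphabet members, so the probabilities must sum to 1.
assert abs(sum(pmf) - 1.0) < 1e-12

def emit(n, seed=0):
    """Emit a sequence (s_0, s_1, ..., s_{n-1}) from the stationary source:
    the same pmf is used at every time index t."""
    rng = random.Random(seed)
    return rng.choices(alphabet, weights=pmf, k=n)

print(emit(10))
```

A non-stationary source would instead let `pmf` change with the time index t.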
Source Alphabet and Entropy (cont'd)
Information Sources: Classification
Stationary versus Non-Stationary Source:
For a stationary source the set of probabilities is not a function of time, whereas for a non-stationary source it is.
Synchronous versus Asynchronous Source:
A synchronous source emits a new symbol at a fixed time interval, T_s, whereas for an asynchronous source the interval between emitted symbols is not fixed.
The latter can be approximated as synchronous by defining a null character: when the source does not emit at time t, we say the source emits the null character at time t.
Source Alphabet and Entropy (cont'd)
Representation of the Source Symbols
- In digital systems, the binary representation is used. The symbols are referred to as Source Data.
Distinction between Data and Information
Example: An information source has an alphabet with only 1 symbol. This symbol is data, but this data is not information, as it is completely uninformative. Since information carries the connotation of uncertainty, the information content of this source is zero.
HOW CAN ONE MEASURE THE INFORMATION CONTENT OF A SOURCE?
Pop Quiz: How many bits are needed to represent the symbols 1, 2, 3, 4? Or, in general, n symbols 1, 2, 3, ..., n?
Answer: 2 bits, and ⌈log2 n⌉ bits, respectively (4 = 2^(log2 4), and n = 2^(log2 n) when n is a power of 2).
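The quiz answer can be checked numerically; `bits_per_symbol` is a hypothetical helper name, not from the lecture:

```python
from math import ceil, log2

def bits_per_symbol(n):
    """Fixed-length binary digits needed to index n distinct symbols:
    ceil(log2 n), which equals log2 n exactly when n is a power of 2."""
    return ceil(log2(n))

print(bits_per_symbol(4))   # 4 symbols -> 2 bits
print(bits_per_symbol(26))  # e.g., a 26-letter alphabet -> 5 bits
```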
Source Alphabet and Entropy (cont'd)
Each transmitted symbol 1 is just one choice out of 1/p_1 many possible choices, and therefore symbol 1 contains log2(1/p_1) bits of information (since 1/p_1 = 2^(log2(1/p_1))).
Similarly, symbol k contains log2(1/p_k) bits of information.
The average information bits per symbol for our source is
H(A) = p_1 log2(1/p_1) + ... + p_n log2(1/p_n).
Example: Pick a marble from a bag of 2 blue and 5 red marbles.
Probability of picking a red marble: p_red = 5/7.
Number of choices for each red picked: 1/p_red = 7/5 = 1.4.
Shannon gave this precise mathematical definition of the average amount of information conveyed per source symbol, used to measure the information content of a source.
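The average-information formula can be applied directly to the marble example from the slide:

```python
from math import log2

def entropy(pmf):
    """H(A) = sum_k p_k * log2(1/p_k): average information bits per symbol.
    Terms with p_k = 0 contribute nothing and are skipped."""
    return sum(p * log2(1.0 / p) for p in pmf if p > 0)

# The slide's example: a bag of 2 blue and 5 red marbles.
p_blue, p_red = 2 / 7, 5 / 7
H = entropy([p_blue, p_red])
print(f"H = {H:.4f} bits per draw")  # roughly 0.863 bits
```

The result is well below 1 bit, reflecting that a draw from this skewed bag is more predictable than a fair coin flip.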
Source Alphabet and Entropy (cont'd)
Entropy of a Source
H(A) = Σ_{m=0}^{M-1} p_m log2(1/p_m)
Unit of measure: bit.
M is the cardinality of the source alphabet A.
Range: 0 ≤ H(A) ≤ log2 M, with H(A) = log2 M when p_m = 1/M for m = 1, ..., M.
Measurement of the Information Efficiency of the Source
RATIO OF THE ENTROPY OF THE SOURCE TO THE (AVERAGE) NUMBER OF BINARY DIGITS USED TO REPRESENT THE SOURCE DATA.
When the entropy of the source is lower than the (average) number of bits used to represent the source data, an efficient coding scheme can be used to encode the source information using, on average, fewer binary digits. This is called Data Compression, and the encoder used for that is called a Source Encoder.
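The efficiency ratio can be computed for a small example. The skewed 4-symbol distribution below is an illustrative assumption; it shows a source whose entropy (1.75 bits) is below its fixed-length representation cost (2 bits), leaving room for compression:

```python
from math import ceil, log2

def entropy(pmf):
    """H(A) = sum_m p_m * log2(1/p_m), in bits per symbol."""
    return sum(p * log2(1.0 / p) for p in pmf if p > 0)

# Hypothetical 4-symbol source with a skewed distribution.
pmf = [0.5, 0.25, 0.125, 0.125]

H = entropy(pmf)                   # source entropy: 1.75 bits/symbol
fixed_bits = ceil(log2(len(pmf)))  # fixed-length representation: 2 bits/symbol
efficiency = H / fixed_bits        # ratio of entropy to binary digits used

print(H, fixed_bits, efficiency)   # 1.75 2 0.875
```

A uniform distribution over the 4 symbols would instead give H(A) = log2 4 = 2 bits, efficiency 1, and no room for compression.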