
Statistics 910, #7

    Eigenvectors and Eigenvalues of Stationary Processes

Overview

1. Toeplitz matrices

2. Szegő's theorem

3. Circulant matrices

4. Matrix norms

5. Approximation via circulants

    Toeplitz and circulant matrices

Toeplitz matrix  A banded, square matrix \Gamma_n (subscript n for the n \times n matrix) with elements [\Gamma_n]_{jk} = \gamma_{j-k},

  \Gamma_n = \begin{pmatrix}
    \gamma_0     & \gamma_1     & \gamma_2 & \cdots & \gamma_{n-1} \\
    \gamma_1     & \gamma_0     & \gamma_1 & \cdots & \gamma_{n-2} \\
    \gamma_2     & \gamma_1     & \gamma_0 & \ddots & \vdots       \\
    \vdots       &              & \ddots   & \ddots & \gamma_1     \\
    \gamma_{n-1} & \gamma_{n-2} & \cdots   & \gamma_1 & \gamma_0
  \end{pmatrix}   (1)
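A matrix like (1) is easy to build directly from a vector of covariances; a minimal sketch using scipy.linalg.toeplitz, with covariance values that are illustrative only:

    # Build the n x n Toeplitz covariance matrix (1) from gamma_0, ..., gamma_{n-1}.
    import numpy as np
    from scipy.linalg import toeplitz

    gamma = np.array([1.36, 0.6, 0.0, 0.0, 0.0])   # illustrative covariances, n = 5
    Gamma_n = toeplitz(gamma)                      # symmetric: [Gamma_n]_{jk} = gamma_{|j-k|}
    print(Gamma_n)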

Symmetry  Toeplitz matrices don't have to be symmetric or real-valued, but ours will be, since we'll set \gamma_h = \gamma_{-h} = Cov(X_{t+h}, X_t) for some stationary process X_t. From now on, \Gamma_n is the covariance matrix of a stationary process.

Assumptions  We will assume that the covariances are absolutely summable, \sum_h |\gamma_h| < \infty.

Breadth  The results shown here are in the form that's most accessible, without searching for too much generality. All of them extend to other types of sequences, such as those that are square summable, and to other matrices that need not be symmetric.


Szegő's Theorem

Fourier transform  Since the covariances are absolutely summable, it is (relatively) easy to show that we can define a continuous function from the \gamma_h,

  f(\omega) = \sum_{h=-\infty}^{\infty} \gamma_h e^{-i 2\pi \omega h}, \qquad -\tfrac{1}{2} < \omega \le \tfrac{1}{2}.   (2)

If the \gamma_h are the covariances of a stationary process, then f(\omega) is known as the spectral density function or spectrum of the process. (See Sections 4.1-4.3 in SS.)

Inverse transform  The Fourier transform is invertible in the sense that we can recover the sequence of covariances from the spectral density function f(\omega) by integration,

  \gamma_h = \int_{-1/2}^{1/2} f(\omega) e^{i 2\pi \omega h} \, d\omega.   (3)

Heuristically, the expression for the variance, \gamma_0 = \int_{-1/2}^{1/2} f(\omega) \, d\omega, suggests that the spectral density decomposes the variance of the process into a continuous mix of frequencies.
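As a small numerical illustration of the transform pair (2)-(3), the sketch below assumes MA(1) covariances (\gamma_0 = 1 + \theta^2, \gamma_1 = \theta, zero otherwise) and recovers each \gamma_h by integrating the spectral density; the model and parameter value are illustrative only.

    # The Fourier pair (2)-(3): covariances <-> spectral density.
    import numpy as np
    from scipy.integrate import quad

    theta = 0.6
    f = lambda w: (1 + theta**2) + 2 * theta * np.cos(2 * np.pi * w)   # MA(1) spectral density

    for h in (0, 1, 2):
        # (3); the sine part integrates to zero because f is even in w.
        gh = quad(lambda w: f(w) * np.cos(2 * np.pi * w * h), -0.5, 0.5)[0]
        print(h, gh)    # 1.36, 0.6, 0.0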

Szegő's theorem  Define the eigenvalues of \Gamma_n as

  \lambda_{n,0}, \lambda_{n,1}, \ldots, \lambda_{n,n-1}.

These are all positive if \Gamma_n is positive definite (as we often require or assume). Szegő's theorem shows that we can use the spectrum to approximate various functions of the eigenvalues. (An interesting question in the analysis of Toeplitz matrices in general is what happens when \Gamma_n is not full rank.)

Let G denote an arbitrary continuous function. Then

  \lim_{n \to \infty} \frac{1}{n} \sum_{h=0}^{n-1} G(\lambda_{n,h}) = \int_{-1/2}^{1/2} G(f(\omega)) \, d\omega.   (4)

That is, we get to replace a sum by an integral. The sum resembles a Riemann approximation to an integral, as if \lambda_{n,h} = f(h/n).
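A quick numerical check of (4): the sketch below assumes an MA(1) process X_t = w_t + \theta w_{t-1} with unit innovation variance, so \gamma_0 = 1 + \theta^2, \gamma_1 = \theta, all other covariances are zero, and f(\omega) = 1 + \theta^2 + 2\theta\cos(2\pi\omega); the choices of \theta, n, and G are illustrative.

    # Szego's theorem (4): average of G(eigenvalues) vs. integral of G(f).
    import numpy as np
    from scipy.linalg import toeplitz
    from scipy.integrate import quad

    theta, n = 0.6, 400
    gamma = np.zeros(n)
    gamma[0], gamma[1] = 1 + theta**2, theta         # MA(1) autocovariances
    Gamma_n = toeplitz(gamma)                        # n x n Toeplitz covariance matrix

    f = lambda w: 1 + theta**2 + 2 * theta * np.cos(2 * np.pi * w)   # spectral density
    G = lambda x: x**2                               # any continuous G will do

    lhs = np.mean(G(np.linalg.eigvalsh(Gamma_n)))    # (1/n) sum_h G(lambda_{n,h})
    rhs = quad(lambda w: G(f(w)), -0.5, 0.5)[0]      # integral of G(f(w)) over (-1/2, 1/2]
    print(lhs, rhs)                                  # nearly equal for large n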


    Basic examples

Trace  The trace of a square matrix is the sum of its diagonal values, which equals the sum of its eigenvalues,

  \frac{1}{n} \mathrm{trace}(\Gamma_n) = \frac{1}{n} \sum_h \lambda_{n,h} \to \int_{-1/2}^{1/2} f(\omega) \, d\omega.

Though it follows directly from (3) that \gamma_0 = \int_{-1/2}^{1/2} f(\omega) \, d\omega, it is also a consequence of (4).

Determinant  The product of the eigenvalues of a matrix,

  \log |\Gamma_n|^{1/n} = \frac{1}{n} \sum_h \log \lambda_{n,h} \to \int_{-1/2}^{1/2} \log f(\omega) \, d\omega.
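Both limits are easy to check numerically; this sketch reuses the illustrative MA(1) covariances (unit innovation variance) from the previous snippet, for which \gamma_0 = 1.36 and \int \log f(\omega)\,d\omega = 0.

    # Trace and determinant limits via the eigenvalues of Gamma_n.
    import numpy as np
    from scipy.linalg import toeplitz
    from scipy.integrate import quad

    theta, n = 0.6, 400
    gamma = np.zeros(n); gamma[0], gamma[1] = 1 + theta**2, theta
    eig = np.linalg.eigvalsh(toeplitz(gamma))
    f = lambda w: 1 + theta**2 + 2 * theta * np.cos(2 * np.pi * w)

    print(np.mean(eig), quad(f, -0.5, 0.5)[0])          # both equal gamma_0 = 1.36
    print(np.mean(np.log(eig)),                         # log |Gamma_n|^{1/n}
          quad(lambda w: np.log(f(w)), -0.5, 0.5)[0])   # integral of log f, here 0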

Prediction application  Here's a surprising application of Szegő's theorem to prove a theorem of Kolmogorov. If X_t is a stationary process, how well is it possible to predict X_{n+1} linearly from its n predecessors? In particular, what is

  V_n = \min_a E\left( X_{n+1} - a_0 X_n - a_1 X_{n-1} - \cdots - a_{n-1} X_1 \right)^2 ?

In the Gaussian case, the problem is equivalent to finding the conditional variance of X_{n+1} given X_{1:n}. Szegő's theorem provides an elegant answer:

  \lim_{n \to \infty} V_n = \exp\left( \int_{-1/2}^{1/2} \log f(\omega) \, d\omega \right).   (5)

The answer follows from applying (4) to the ratio of determinants, once you note that

  V_n = \frac{|\Gamma_{n+1}|}{|\Gamma_n|}.   (6)

(Long aside: To see that (6) holds, covariance manipulations of the multivariate normal distribution show you that the variance of the scalar r.v. Y given the vector r.v. X is Var(Y|X) = \sigma_{yy} - \sigma_{yx} \Sigma_{xx}^{-1} \sigma_{xy}, where the variance matrix is partitioned as

  \Sigma = \mathrm{Var}\begin{pmatrix} Y \\ X \end{pmatrix}
         = \begin{pmatrix} \sigma_{yy} & \sigma_{yx} \\ \sigma_{xy} & \Sigma_{xx} \end{pmatrix}.


To find the determinant |\Sigma|, postmultiply by a matrix with 1's on the diagonal, obtaining

  |\Sigma| = \left| \Sigma \begin{pmatrix} 1 & 0 \\ -\Sigma_{xx}^{-1}\sigma_{xy} & I \end{pmatrix} \right|
           = (\sigma_{yy} - \sigma_{yx} \Sigma_{xx}^{-1} \sigma_{xy}) \, |\Sigma_{xx}|.

The expression (6) follows. Note that the regression vector is \Sigma_{xx}^{-1}\sigma_{xy}.)

Now back to the prediction problem. From Szegő's theorem, \frac{1}{n+1}\sum_j \log \lambda_{n+1,j} \approx \frac{1}{n}\sum_j \log \lambda_{n,j} since both approximate the same integral. Now plug in |\Gamma_{n+1}|^{1/(n+1)} \approx |\Gamma_n|^{1/n} and (5) follows.
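For a concrete check of (5)-(6), the sketch below again assumes the invertible MA(1) model with unit innovation variance, for which \exp(\int \log f(\omega)\,d\omega) = 1, and computes V_n as the determinant ratio; the values of \theta and n are illustrative.

    # Kolmogorov's prediction-variance formula via the determinant ratio (6).
    import numpy as np
    from scipy.linalg import toeplitz

    theta = 0.6
    def log_det_Gamma(n):
        gamma = np.zeros(n); gamma[0], gamma[1] = 1 + theta**2, theta
        return np.linalg.slogdet(toeplitz(gamma))[1]    # log |Gamma_n|

    for n in (2, 5, 25, 100):
        V_n = np.exp(log_det_Gamma(n + 1) - log_det_Gamma(n))
        print(n, V_n)                                   # decreases toward exp(int log f) = 1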

    Circulant matrices

Circulant matrix  A special type of Toeplitz matrix constructed by rotating a vector c_0, c_1, \ldots, c_{n-1}, say, cyclically by one position to fill successive rows of a matrix,

  C_n = \begin{pmatrix}
    c_0     & c_1     & c_2    & \cdots  & c_{n-1} \\
    c_{n-1} & c_0     & c_1    & \cdots  & c_{n-2} \\
    c_{n-2} & c_{n-1} & c_0    & \ddots  & \vdots  \\
    \vdots  &         & \ddots & \ddots  & c_1     \\
    c_1     & c_2     & \cdots & c_{n-1} & c_0
  \end{pmatrix}   (7)

Circulant matrices are an important type of Toeplitz matrix for our purpose (they have others!) because we can easily find their eigenvectors and eigenvalues.

Eigenvectors  These are relatively easy to find from the basic definition of an eigenvector and a great guess for the answer. An eigenvector u of C_n satisfies C_n u = \lambda u, which gives a system of n equations:

  c_0 u_0 + c_1 u_1 + \cdots + c_{n-1} u_{n-1} = \lambda u_0
  c_{n-1} u_0 + c_0 u_1 + \cdots + c_{n-2} u_{n-1} = \lambda u_1
  \vdots
  c_1 u_0 + c_2 u_1 + \cdots + c_0 u_{n-1} = \lambda u_{n-1}


The equation in the r-th row is, in modular form,

  row r:  \sum_{j=0}^{n-r-1} c_j u_{j+r} + \sum_{j=n-r}^{n-1} c_j u_{(j+r)|n} = \lambda u_r

or in more conventional form,

  row r:  \sum_{j=0}^{n-r-1} c_j u_{j+r} + \sum_{j=n-r}^{n-1} c_j u_{j+r-n} = \lambda u_r .

Guess u_j = \rho^j (maybe reasonable to guess this if you have been studying differential equations). Then

  \sum_{j=0}^{n-r-1} c_j \rho^{j+r} + \sum_{j=n-r}^{n-1} c_j \rho^{j+r-n} = \lambda \rho^r .

If \rho^n = 1, it works! These are the n roots of unity; any \rho = \exp(-i 2\pi j/n) works for j = 0, 1, \ldots, n-1. The eigenvectors have the form u = (\rho^0, \rho^1, \rho^2, \ldots, \rho^{n-1}):

  u_r = (1, e^{-i 2\pi r/n}, e^{-i 2\pi 2r/n}, \ldots, e^{-i 2\pi (n-1)r/n})

as seen in the discrete Fourier transform.

Eigenvalues  These are the discrete Fourier transforms of the sequence that defines the circulant,

  \lambda_{n,k} = \sum_{j=0}^{n-1} c_j e^{-i 2\pi jk/n} .

If the c_j = \gamma_j, then we've got the first n terms of the sum that defines the spectral density (2).
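Both facts are easy to verify numerically. The sketch below uses an arbitrary illustrative vector c (not covariances); note that scipy.linalg.circulant places c in the first column, so a transpose is needed to match the first-row convention of (7).

    # Eigenvalues and eigenvectors of a circulant via the DFT.
    import numpy as np
    from scipy.linalg import circulant

    c = np.array([4.0, 2.0, 1.0, 0.5, 0.25, 3.0])     # arbitrary vector c_0, ..., c_{n-1}
    n = len(c)
    C = circulant(c).T                                # first row is c_0, c_1, ..., c_{n-1}, as in (7)

    lam = np.fft.fft(c)                               # lambda_{n,k} = sum_j c_j e^{-i 2 pi jk/n}
    k = 2
    u = np.exp(-2j * np.pi * k * np.arange(n) / n)    # (1, rho, rho^2, ...), rho = e^{-i 2 pi k/n}
    print(np.allclose(C @ u, lam[k] * u))             # True: C_n u = lambda_{n,k} u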

Implications for covariance matrices  We can now anticipate the results. Asymptotically in n, the vectors that define the discrete Fourier transform are eigenvectors of every covariance matrix. The eigenvalues are then the transform coefficients of the covariances, namely values of the spectral density function. If we define a unitary matrix U = (u_0, u_1, \ldots, u_{n-1})/\sqrt{n} from the eigenvectors u_j, then we obtain

  U^* \Gamma_n U \approx \mathrm{diag}(f(\omega_j)), \qquad \omega_j = j/n .
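The heuristic can be seen directly: with the illustrative MA(1) covariances again, the normalized DFT matrix nearly diagonalizes \Gamma_n and the diagonal entries track f(j/n). This is only a finite-n sketch of the asymptotic statement above.

    # The DFT vectors approximately diagonalize the Toeplitz covariance matrix.
    import numpy as np
    from scipy.linalg import toeplitz

    theta, n = 0.6, 256
    gamma = np.zeros(n); gamma[0], gamma[1] = 1 + theta**2, theta
    Gamma_n = toeplitz(gamma)
    f = lambda w: 1 + theta**2 + 2 * theta * np.cos(2 * np.pi * w)

    idx = np.arange(n)
    U = np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)   # unitary DFT matrix
    D = U.conj().T @ Gamma_n @ U                                    # approximately diagonal

    print(np.max(np.abs(np.diag(D).real - f(idx / n))))   # diagonal is close to f(j/n)
    off = D - np.diag(np.diag(D))
    print(np.sqrt(np.sum(np.abs(off)**2) / n))            # weak norm of off-diagonal part is small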


To confirm these guesses, we need to show that a circulant matrix provides a good approximation to the covariance matrix, good enough so that we can use the results for circulants when describing covariance matrices. To describe what we mean by a good approximation, we need a way to measure the distance between matrices, a norm.

    Matrix norms

Norms  A norm on a vector space V is any function \|x\| for x \in V for which (\alpha \in R)

  1. \|x\| > 0 for x \ne 0, and \|0\| = 0.
  2. \|\alpha x\| = |\alpha| \, \|x\|
  3. \|x + y\| \le \|x\| + \|y\|   (triangle inequality)

Operator norm  Define the operator norm of a square matrix T as the maximum ratio of quadratic forms,

  \|T\| = \max_x \frac{x' T x}{x' x} = \max_j \lambda_j

where the \lambda_j are the eigenvalues (singular values) of T.

Hilbert-Schmidt or weak norm  For an n \times n matrix T,

  |T|^2 = \frac{1}{n} \sum_{j,k} |t_{jk}|^2 = \frac{1}{n} \mathrm{trace}(T^* T) = \frac{1}{n} \sum_j \lambda_j^2 .

Connections  It is indeed a weaker norm,

  |T|^2 \le \|T\|^2 = \lambda_{\max}^2 .

The two norms allow you to handle products,

  |T S| \le \|T\| \, |S| .
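Both norms are available through numpy, so the two inequalities can be spot-checked; the random matrices below are purely illustrative.

    # Operator norm vs. the weak (Hilbert-Schmidt) norm.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 50
    A = rng.standard_normal((n, n)); T = (A + A.T) / 2        # symmetric test matrix
    S = rng.standard_normal((n, n))

    weak = lambda M: np.linalg.norm(M, 'fro') / np.sqrt(n)    # |M| = sqrt((1/n) sum |m_jk|^2)
    op = np.linalg.norm(T, 2)                                 # ||T|| = largest |eigenvalue|
    print(weak(T) <= op)                                      # weak norm is weaker: True
    print(weak(T @ S) <= op * weak(S))                        # |TS| <= ||T|| |S|: True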


    Approximation via circulants

An approximation  Consider the following circulant that approximates \Gamma_n, obtained by flipping the covariances and running them both ways:

  G_n = \begin{pmatrix}
    \gamma_0                & \gamma_1 + \gamma_{n-1} & \gamma_2 + \gamma_{n-2} & \cdots & \gamma_{n-1} + \gamma_1 \\
    \gamma_1 + \gamma_{n-1} & \gamma_0                & \gamma_1 + \gamma_{n-1} & \cdots & \gamma_{n-2} + \gamma_2 \\
    \gamma_2 + \gamma_{n-2} & \gamma_1 + \gamma_{n-1} & \gamma_0                & \ddots & \vdots                  \\
    \vdots                  &                         & \ddots                  & \ddots & \gamma_1 + \gamma_{n-1} \\
    \gamma_{n-1} + \gamma_1 & \gamma_{n-2} + \gamma_2 & \cdots                  & \gamma_1 + \gamma_{n-1} & \gamma_0
  \end{pmatrix}   (8)

The norm of the difference \Gamma_n - G_n is a familiar type of sum,

  |\Gamma_n - G_n|^2 = \frac{2}{n} \sum_{h=1}^{n-1} (n-h) \gamma_{n-h}^2
                     = \frac{2}{n} \sum_{h=1}^{n-1} h \, \gamma_h^2          (reverse the sum)
                     = 2 \sum_{h=1}^{n-1} \frac{h}{n} \gamma_h^2 .

Since we assume that \sum_j |\gamma_j| < \infty, the sum converges to 0 (each term is bounded by the summable \gamma_h^2 while h/n \to 0 for each fixed h),

  \lim_{n \to \infty} |\Gamma_n - G_n| = 0 .
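The shrinking difference is easy to see numerically; the sketch below builds G_n from the illustrative MA(1) covariances and reports the weak norm of \Gamma_n - G_n for a few values of n.

    # Weak norm of the Toeplitz-minus-circulant difference shrinks with n.
    import numpy as np
    from scipy.linalg import toeplitz, circulant

    theta = 0.6
    for n in (10, 50, 250):
        gamma = np.zeros(n); gamma[0], gamma[1] = 1 + theta**2, theta
        Gamma_n = toeplitz(gamma)
        first_row = gamma + np.r_[0.0, gamma[1:][::-1]]    # gamma_h + gamma_{n-h}
        G_n = circulant(first_row).T                       # the circulant (8)
        weak = np.linalg.norm(Gamma_n - G_n, 'fro') / np.sqrt(n)
        print(n, weak)                                     # equals sqrt(2)*theta/sqrt(n) here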

Eigenvalues of close matrices  If two matrices are close in the weak norm, then the averages of their eigenvalues are close. In particular, given two matrices A and B with eigenvalues \alpha_j and \beta_j, then

  \left| \frac{1}{n} \sum_j \alpha_j - \frac{1}{n} \sum_j \beta_j \right| \le |A - B| .

Proof: If D = A - B, then

  \sum_j \alpha_j - \sum_j \beta_j = \mathrm{trace}(A) - \mathrm{trace}(B) = \mathrm{trace}(D).

Now use Cauchy-Schwarz in the form

  \left( \sum_j a_j \right)^2 = \left( \sum_j a_j \cdot 1 \right)^2 \le \left( \sum_j a_j^2 \right) n   (9)


to show that

  |\mathrm{trace}(D)|^2 = \left| \sum_j d_{jj} \right|^2 \le n \sum_j d_{jj}^2 \le n \sum_{j,k} d_{jk}^2 = n^2 |D|^2 .

    The loose bounds in the proof suggest you can do a lot better. In fact,

    you can move the absolute value inside the sum (so that the actual

    eigenvalues are getting close, not just on average).
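As a sanity check on that stronger claim, the sketch below compares the sorted eigenvalues of the illustrative MA(1) \Gamma_n and its circulant approximation G_n: not just the averages but the individual sorted values agree closely.

    # Sorted eigenvalues of Gamma_n and of its circulant approximation G_n.
    import numpy as np
    from scipy.linalg import toeplitz, circulant

    theta, n = 0.6, 200
    gamma = np.zeros(n); gamma[0], gamma[1] = 1 + theta**2, theta
    Gamma_n = toeplitz(gamma)
    G_n = circulant(gamma + np.r_[0.0, gamma[1:][::-1]]).T

    lam = np.sort(np.linalg.eigvalsh(Gamma_n))
    mu = np.sort(np.linalg.eigvalsh(G_n))
    print(abs(lam.mean() - mu.mean()))     # averages essentially identical
    print(np.max(np.abs(lam - mu)))        # individual sorted eigenvalues are close too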

Powers of eigenvalues  A messier argument produces a similar result. Given two matrices A and B with |A - B| \to 0 and eigenvalues \alpha_j and \beta_j, then for any power s,

  \lim_{n \to \infty} \frac{1}{n} \sum_j (\alpha_j^s - \beta_j^s) = 0 .

Extension  Now that we've found that powers of the eigenvalues converge for close matrices in the limit, it's not hard to get the result for a continuous function. The argument is a common one: polynomials can approximate any continuous function g. The result implies that

  \lim_{n \to \infty} \frac{1}{n} \sum_j \left( g(\alpha_j) - g(\beta_j) \right) = 0 .