Lec23 Semidef Opt



    Introduction to Semidefinite Programming (SDP)

    Robert M. Freund

    March, 2004


© 2004 Massachusetts Institute of Technology.


    1 Introduction

Semidefinite programming (SDP) is the most exciting development in mathematical programming in the 1990s. SDP has applications in such diverse fields as traditional convex constrained optimization, control theory, and combinatorial optimization. Because SDP is solvable via interior point methods, most of these applications can usually be solved very efficiently in practice as well as in theory.

    2 Review of Linear Programming

Consider the linear programming problem in standard form:

LP : minimize cᵀx
     s.t.     aᵢᵀx = bᵢ, i = 1, . . . , m
              x ∈ ℝⁿ₊.

Here x is a vector of n variables, and we write cᵀx for the inner product Σⱼ₌₁ⁿ cⱼxⱼ, etc.

Also, ℝⁿ₊ := {x ∈ ℝⁿ | x ≥ 0}, and we call ℝⁿ₊ the nonnegative orthant. In fact, ℝⁿ₊ is a closed convex cone, where K is called a closed convex cone if K satisfies the following two conditions:

• If x, w ∈ K, then αx + βw ∈ K for all nonnegative scalars α and β.

• K is a closed set.


In words, LP is the following problem:

Minimize the linear function cᵀx, subject to the condition that x must solve m given equations aᵢᵀx = bᵢ, i = 1, . . . , m, and that x must lie in the closed convex cone K = ℝⁿ₊.

We will write the standard linear programming dual problem as:

LD : maximize Σᵢ₌₁ᵐ yᵢbᵢ
     s.t.     Σᵢ₌₁ᵐ yᵢaᵢ + s = c
              s ∈ ℝⁿ₊.

Given a feasible solution x of LP and a feasible solution (y, s) of LD, the duality gap is simply cᵀx − Σᵢ₌₁ᵐ yᵢbᵢ = (c − Σᵢ₌₁ᵐ yᵢaᵢ)ᵀx = sᵀx ≥ 0, because x ≥ 0 and s ≥ 0. We know from LP duality theory that so long as the primal problem LP is feasible and has bounded optimal objective value, then the primal and the dual both attain their optima with no duality gap. That is, there exist x* and (y*, s*) feasible for the primal and dual, respectively, such that cᵀx* − Σᵢ₌₁ᵐ yᵢ*bᵢ = (s*)ᵀx* = 0.
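The gap identity cᵀx − yᵀb = sᵀx can be checked numerically. The sketch below uses a small illustrative LP instance (the data A, b, c and the feasible pair are made up for this check, not taken from the notes):

```python
import numpy as np

# Illustrative LP data: rows of A are the vectors a_i.
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 0.0]])
b = np.array([4.0, 3.0])
c = np.array([2.0, 3.0, 1.0])

x = np.array([1.0, 1.0, 2.0])     # primal feasible: A @ x = b, x >= 0
y = np.array([1.0, 0.5])          # dual multipliers
s = c - A.T @ y                   # dual slack: sum_i y_i a_i + s = c

assert np.allclose(A @ x, b) and (x >= 0).all() and (s >= 0).all()

# Duality gap: c'x - y'b = (c - sum_i y_i a_i)'x = s'x >= 0
gap = c @ x - y @ b
assert np.isclose(gap, s @ x) and gap >= 0
print(gap)
```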

    3 Facts about Matrices and the Semidefinite Cone

If X is an n × n matrix, then X is a positive semidefinite (psd) matrix if

vᵀXv ≥ 0 for any v ∈ ℝⁿ.


If X is an n × n matrix, then X is a positive definite (pd) matrix if

vᵀXv > 0 for any v ∈ ℝⁿ, v ≠ 0.

Let Sⁿ denote the set of symmetric n × n matrices, and let Sⁿ₊ denote the set of positive semidefinite (psd) n × n symmetric matrices. Similarly let Sⁿ₊₊ denote the set of positive definite (pd) n × n symmetric matrices.

Let X and Y be any symmetric matrices. We write X ⪰ 0 to denote that X is symmetric and positive semidefinite, and we write X ⪰ Y to denote that X − Y ⪰ 0. We write X ≻ 0 to denote that X is symmetric and positive definite, etc.

Remark 1 Sⁿ₊ = {X ∈ Sⁿ | X ⪰ 0} is a closed convex cone in ℝⁿ² of dimension n × (n + 1)/2.

To see why this remark is true, suppose that X, W ∈ Sⁿ₊. Pick any scalars α, β ≥ 0. For any v ∈ ℝⁿ, we have:

vᵀ(αX + βW)v = αvᵀXv + βvᵀWv ≥ 0,

whereby αX + βW ∈ Sⁿ₊. This shows that Sⁿ₊ is a cone. It is also straightforward to show that Sⁿ₊ is a closed set.

Recall the following properties of symmetric matrices:

• If X ∈ Sⁿ, then X = QDQᵀ for some orthonormal matrix Q and some diagonal matrix D. (Recall that Q is orthonormal means that Q⁻¹ = Qᵀ, and that D is diagonal means that the off-diagonal entries of D are all zeros.)

• If X = QDQᵀ as above, then the columns of Q form a set of n orthogonal eigenvectors of X, whose eigenvalues are the corresponding diagonal entries of D.

• X ⪰ 0 if and only if X = QDQᵀ where the eigenvalues (i.e., the diagonal entries of D) are all nonnegative.

• X ≻ 0 if and only if X = QDQᵀ where the eigenvalues (i.e., the diagonal entries of D) are all positive.

• If X ⪰ 0 and if Xᵢᵢ = 0, then Xᵢⱼ = Xⱼᵢ = 0 for all j = 1, . . . , n.

• Consider the matrix M defined as follows:

M = ⎡ P   v ⎤
    ⎣ vᵀ  d ⎦ ,

where P ≻ 0, v is a vector, and d is a scalar. Then M ≻ 0 if and only if d − vᵀP⁻¹v > 0.
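Two of the facts above can be sketched numerically: the eigenvalue test for positive semidefiniteness, and the Schur-complement criterion for M ≻ 0. The matrices below are illustrative, not from the notes:

```python
import numpy as np

# X ⪰ 0 iff all eigenvalues of X are nonnegative.
X = np.array([[2.0, -1.0], [-1.0, 2.0]])   # symmetric, eigenvalues 1 and 3
assert (np.linalg.eigvalsh(X) >= 0).all()

# M = [[P, v], [v', d]] with P ≻ 0 is pd iff d - v'P⁻¹v > 0.
P = np.array([[4.0, 0.0], [0.0, 1.0]])
v = np.array([1.0, 1.0])
d = 2.0
M = np.block([[P, v[:, None]], [v[None, :], np.array([[d]])]])

schur = d - v @ np.linalg.solve(P, v)      # d - v'P⁻¹v = 2 - 1.25 = 0.75
assert schur > 0
assert (np.linalg.eigvalsh(M) > 0).all()   # hence M is positive definite
```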

    4 Semidefinite Programming

Let X ∈ Sⁿ. We can think of X as a matrix, or equivalently, as an array of n² components of the form (x₁₁, . . . , xₙₙ). We can also just think of X as an object (a vector) in the space Sⁿ. All three different equivalent ways of looking at X will be useful.

What will a linear function of X look like? If C(X) is a linear function of X, then C(X) can be written as C • X, where

C • X := Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ Cᵢⱼ Xᵢⱼ.

If X is a symmetric matrix, there is no loss of generality in assuming that the matrix C is also symmetric. With this notation, we are now ready to define a semidefinite program. A semidefinite program (SDP) is an optimization problem of the form:

SDP : minimize C • X
      s.t.     Aᵢ • X = bᵢ, i = 1, . . . , m,
               X ⪰ 0.

Notice that in an SDP the variable is the matrix X, but it might be helpful to think of X as an array of n² numbers or simply as a vector in Sⁿ. The objective function is the linear function C • X, and there are m linear equations that X must satisfy, namely Aᵢ • X = bᵢ, i = 1, . . . , m. The variable X also must lie in the (closed convex) cone of positive semidefinite symmetric matrices Sⁿ₊. Note that the data for SDP consists of the symmetric matrix C (which is the data for the objective function), the m symmetric matrices A₁, . . . , Aₘ, and the m-vector b, which form the m linear equations.

Let us see an example of an SDP for n = 3 and m = 2. Define the following matrices:

A₁ = ⎡ 1 0 1 ⎤     A₂ = ⎡ 0 2 8 ⎤     C = ⎡ 1 2 3 ⎤
     ⎢ 0 3 7 ⎥ ,        ⎢ 2 6 0 ⎥ ,       ⎢ 2 9 0 ⎥ ,
     ⎣ 1 7 5 ⎦          ⎣ 8 0 4 ⎦         ⎣ 3 0 7 ⎦


and b₁ = 11 and b₂ = 19. Then the variable X will be the 3 × 3 symmetric matrix:

X = ⎡ x₁₁ x₁₂ x₁₃ ⎤
    ⎢ x₂₁ x₂₂ x₂₃ ⎥ ,
    ⎣ x₃₁ x₃₂ x₃₃ ⎦

and so, for example,

C • X = x₁₁ + 2x₁₂ + 3x₁₃ + 2x₂₁ + 9x₂₂ + 0x₂₃ + 3x₃₁ + 0x₃₂ + 7x₃₃
      = x₁₁ + 4x₁₂ + 6x₁₃ + 9x₂₂ + 0x₂₃ + 7x₃₃,

since, in particular, X is symmetric. Therefore the SDP can be written as:

SDP : minimize x₁₁ + 4x₁₂ + 6x₁₃ + 9x₂₂ + 0x₂₃ + 7x₃₃
      s.t.     x₁₁ + 0x₁₂ + 2x₁₃ + 3x₂₂ + 14x₂₃ + 5x₃₃ = 11
               0x₁₁ + 4x₁₂ + 16x₁₃ + 6x₂₂ + 0x₂₃ + 4x₃₃ = 19

               X = ⎡ x₁₁ x₁₂ x₁₃ ⎤
                   ⎢ x₂₁ x₂₂ x₂₃ ⎥ ⪰ 0.
                   ⎣ x₃₁ x₃₂ x₃₃ ⎦
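The expanded coefficients above can be verified with the data of the example; only the sample symmetric matrix X below is made up for the check:

```python
import numpy as np

# The data of the example above.
A1 = np.array([[1., 0., 1.], [0., 3., 7.], [1., 7., 5.]])
C  = np.array([[1., 2., 3.], [2., 9., 0.], [3., 0., 7.]])

def inner(A, X):
    # A • X = sum_ij A_ij X_ij = trace(A'X)
    return float(np.trace(A.T @ X))

# A sample symmetric X (entries chosen arbitrarily for illustration).
X = np.array([[1., 2., 0.], [2., 1., 1.], [0., 1., 1.]])

# C • X should match the expanded form x11 + 4x12 + 6x13 + 9x22 + 0x23 + 7x33
expanded = X[0,0] + 4*X[0,1] + 6*X[0,2] + 9*X[1,1] + 0*X[1,2] + 7*X[2,2]
assert np.isclose(inner(C, X), expanded)

# and A1 • X matches x11 + 2x13 + 3x22 + 14x23 + 5x33.
assert np.isclose(inner(A1, X),
                  X[0,0] + 2*X[0,2] + 3*X[1,1] + 14*X[1,2] + 5*X[2,2])
```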

Notice that SDP looks remarkably similar to a linear program. However, the standard LP constraint that x must lie in the nonnegative orthant is replaced by the constraint that the variable X must lie in the cone of positive semidefinite matrices. Just as x ≥ 0 states that each of the n components of x must be nonnegative, it may be helpful to think of X ⪰ 0 as stating that each of the n eigenvalues of X must be nonnegative.

It is easy to see that a linear program LP is a special instance of an SDP. To see one way of doing this, suppose that (c, a₁, . . . , aₘ, b₁, . . . , bₘ) comprise the data for LP. Then define:

Aᵢ = ⎡ aᵢ₁ 0   . . . 0   ⎤
     ⎢ 0   aᵢ₂ . . . 0   ⎥                          C = ⎡ c₁ 0  . . . 0  ⎤
     ⎢ .   .   . . . .   ⎥ , i = 1, . . . , m, and      ⎢ 0  c₂ . . . 0  ⎥
     ⎣ 0   0   . . . aᵢₙ ⎦                              ⎢ .  .  . . . .  ⎥
                                                        ⎣ 0  0  . . . cₙ ⎦ .

Then LP can be written as:

SDP : minimize C • X
      s.t.     Aᵢ • X = bᵢ, i = 1, . . . , m,
               Xᵢⱼ = 0, i = 1, . . . , n, j = i + 1, . . . , n,
               X ⪰ 0,

with the association that

X = ⎡ x₁ 0  . . . 0  ⎤
    ⎢ 0  x₂ . . . 0  ⎥
    ⎢ .  .  . . . .  ⎥
    ⎣ 0  0  . . . xₙ ⎦ .

Of course, in practice one would never want to convert an instance of LP into an instance of SDP. The above construction merely shows that SDP includes linear programming as a special case.

    5 Semidefinite Programming Duality

The dual problem of SDP is defined (or derived from first principles) to be:

SDD : maximize Σᵢ₌₁ᵐ yᵢbᵢ
      s.t.     Σᵢ₌₁ᵐ yᵢAᵢ + S = C
               S ⪰ 0.

One convenient way of thinking about this problem is as follows. Given multipliers y₁, . . . , yₘ, the objective is to maximize the linear function Σᵢ₌₁ᵐ yᵢbᵢ. The constraints of SDD state that the matrix S defined as

S = C − Σᵢ₌₁ᵐ yᵢAᵢ

must be positive semidefinite. That is,

C − Σᵢ₌₁ᵐ yᵢAᵢ ⪰ 0.

We illustrate this construction with the example presented earlier. The dual problem is:


SDD : maximize 11y₁ + 19y₂

      s.t. y₁ ⎡ 1 0 1 ⎤      ⎡ 0 2 8 ⎤       ⎡ 1 2 3 ⎤
              ⎢ 0 3 7 ⎥ + y₂ ⎢ 2 6 0 ⎥ + S = ⎢ 2 9 0 ⎥
              ⎣ 1 7 5 ⎦      ⎣ 8 0 4 ⎦       ⎣ 3 0 7 ⎦

           S ⪰ 0,

which we can rewrite in the following form:

SDD : maximize 11y₁ + 19y₂

      s.t. ⎡ 1 − 1y₁ − 0y₂   2 − 0y₁ − 2y₂   3 − 1y₁ − 8y₂ ⎤
           ⎢ 2 − 0y₁ − 2y₂   9 − 3y₁ − 6y₂   0 − 7y₁ − 0y₂ ⎥ ⪰ 0.
           ⎣ 3 − 1y₁ − 8y₂   0 − 7y₁ − 0y₂   7 − 5y₁ − 4y₂ ⎦

It is often easier to see and to work with a semidefinite program when it is presented in the format of the dual SDD, since the variables are the m multipliers y₁, . . . , yₘ.

As in linear programming, we can switch from one format of SDP (primal or dual) to any other format with great ease, and there is no loss of generality in assuming a particular specific format for the primal or the dual.

The following proposition states that weak duality must hold for the primal and dual of SDP:


Proposition 5.1 Given a feasible solution X of SDP and a feasible solution (y, S) of SDD, the duality gap is C • X − Σᵢ₌₁ᵐ yᵢbᵢ = S • X ≥ 0. If C • X − Σᵢ₌₁ᵐ yᵢbᵢ = 0, then X and (y, S) are each optimal solutions to SDP and SDD, respectively, and furthermore, SX = 0.

In order to prove Proposition 5.1, it will be convenient to work with the trace of a matrix, defined below:

trace(M) = Σⱼ₌₁ⁿ Mⱼⱼ.

Simple arithmetic can be used to establish the following two elementary identities:

trace(MN) = trace(NM)

A • B = trace(AᵀB)
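Both identities are easy to confirm numerically on random matrices (the random data below is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
N = rng.standard_normal((4, 4))
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# trace(MN) = trace(NM)
assert np.isclose(np.trace(M @ N), np.trace(N @ M))

# A • B := sum_ij A_ij B_ij equals trace(A'B)
assert np.isclose((A * B).sum(), np.trace(A.T @ B))
```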

Proof of Proposition 5.1. For the first part of the proposition, we must show that if S ⪰ 0 and X ⪰ 0, then S • X ≥ 0. Let S = PDPᵀ and X = QEQᵀ, where P, Q are orthonormal matrices and D, E are nonnegative diagonal matrices. We have:

S • X = trace(SᵀX) = trace(SX) = trace(PDPᵀQEQᵀ)
      = trace(DPᵀQEQᵀP) = Σⱼ₌₁ⁿ Dⱼⱼ (PᵀQEQᵀP)ⱼⱼ ≥ 0,

where the last inequality follows from the fact that all Dⱼⱼ ≥ 0 and the fact that the diagonal of the symmetric positive semidefinite matrix PᵀQEQᵀP must be nonnegative.

To prove the second part of the proposition, suppose that trace(SX) = 0. Then from the above equalities, we have

Σⱼ₌₁ⁿ Dⱼⱼ (PᵀQEQᵀP)ⱼⱼ = 0.

However, this implies that for each j = 1, . . . , n, either Dⱼⱼ = 0 or (PᵀQEQᵀP)ⱼⱼ = 0. Furthermore, the latter case implies that the jth row of PᵀQEQᵀP is all zeros. Therefore DPᵀQEQᵀP = 0, and so SX = PDPᵀQEQᵀ = 0. q.e.d.

Unlike the case of linear programming, we cannot assert that either SDP or SDD will attain their respective optima, and/or that there will be no duality gap, unless certain regularity conditions hold. One such regularity condition which ensures that strong duality will prevail is a version of the Slater condition, summarized in the following theorem which we will not prove:

Theorem 5.1 Let z*_P and z*_D denote the optimal objective function values of SDP and SDD, respectively. Suppose that there exists a feasible solution X̂ of SDP such that X̂ ≻ 0, and that there exists a feasible solution (ŷ, Ŝ) of SDD such that Ŝ ≻ 0. Then both SDP and SDD attain their optimal values, and z*_P = z*_D.


6 Key Properties of Linear Programming that do not extend to SDP

The following summarizes some of the more important properties of linear programming that do not extend to SDP:

• There may be a finite or infinite duality gap. The primal and/or dual may or may not attain their optima. As noted above in Theorem 5.1, both programs will attain their common optimal value if both programs have feasible solutions in the interior of the semidefinite cone.

• There is no finite algorithm for solving SDP. There is a simplex algorithm, but it is not a finite algorithm. There is no direct analog of a basic feasible solution for SDP.

• Given rational data, the feasible region may have no rational solutions. The optimal solution may not have rational components or rational eigenvalues.

• Given rational data whose binary encoding is size L, the norms of any feasible and/or optimal solutions may exceed 2^(2L) (or worse).

• Given rational data whose binary encoding is size L, the norms of any feasible and/or optimal solutions may be less than 2^(−2L) (or worse).

    7 SDP in Combinatorial Optimization

SDP has wide applicability in combinatorial optimization. A number of NP-hard combinatorial optimization problems have convex relaxations that are semidefinite programs. In many instances, the SDP relaxation is very tight in practice, and in certain instances in particular, the optimal solution to the SDP relaxation can be converted to a feasible solution for the original problem with provably good objective value. An example of the use of SDP in combinatorial optimization is given below.


    7.1 An SDP Relaxation of the MAX CUT Problem

Let G be an undirected graph with nodes N = {1, . . . , n} and edge set E. Let wᵢⱼ = wⱼᵢ be the weight on edge (i, j), for (i, j) ∈ E. We assume that wᵢⱼ ≥ 0 for all (i, j) ∈ E. The MAX CUT problem is to determine a subset S of the nodes N for which the sum of the weights of the edges that cross from S to its complement S̄ is maximized (where S̄ := N \ S).

We can formulate MAX CUT as an integer program as follows. Let xⱼ = 1 for j ∈ S and xⱼ = −1 for j ∉ S. Then our formulation is:

MAXCUT : maximizeₓ (1/4) Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ wᵢⱼ(1 − xᵢxⱼ)
         s.t.      xⱼ ∈ {−1, 1}, j = 1, . . . , n.

Now let

Y = xxᵀ,

whereby

Yᵢⱼ = xᵢxⱼ, i = 1, . . . , n, j = 1, . . . , n.

Also let W be the matrix whose (i, j)th element is wᵢⱼ for i = 1, . . . , n and j = 1, . . . , n. Then MAX CUT can be equivalently formulated as:


MAXCUT : maximize_{Y,x} (1/4) Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ wᵢⱼ − (1/4) W • Y
         s.t.           xⱼ ∈ {−1, 1}, j = 1, . . . , n
                        Y = xxᵀ.

Notice in this problem that the first set of constraints are equivalent to Yⱼⱼ = 1, j = 1, . . . , n. We therefore obtain:

MAXCUT : maximize_{Y,x} (1/4) Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ wᵢⱼ − (1/4) W • Y
         s.t.           Yⱼⱼ = 1, j = 1, . . . , n
                        Y = xxᵀ.

Last of all, notice that the matrix Y = xxᵀ is a symmetric rank-1 positive semidefinite matrix. If we relax this condition by removing the rank-1 restriction, we obtain the following relaxation of MAX CUT, which is a semidefinite program:

RELAX : maximize_Y (1/4) Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ wᵢⱼ − (1/4) W • Y
        s.t.       Yⱼⱼ = 1, j = 1, . . . , n
                   Y ⪰ 0.

It is therefore easy to see that RELAX provides an upper bound on MAX CUT, i.e.,


MAXCUT ≤ RELAX.

As it turns out, one can also prove without too much effort that:

0.87856 RELAX ≤ MAXCUT ≤ RELAX.

This is an impressive result, in that it states that the value of the semidefinite relaxation is guaranteed to be no more than 1/0.87856 ≈ 1.14 times the value of the NP-hard problem MAX CUT.
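The identity behind the integer-program formulation — that (1/4) ΣᵢΣⱼ wᵢⱼ(1 − xᵢxⱼ) is exactly the weight of the cut induced by the signs xⱼ — can be checked by brute force on a small graph. The weight matrix W below is illustrative, not from the notes:

```python
import itertools
import numpy as np

# Small weighted graph; W is symmetric with zero diagonal (illustrative data).
W = np.array([[0., 1., 2., 0.],
              [1., 0., 1., 3.],
              [2., 1., 0., 1.],
              [0., 3., 1., 0.]])
n = W.shape[0]

best = -np.inf
for signs in itertools.product([-1, 1], repeat=n):
    x = np.array(signs, dtype=float)
    # objective (1/4) sum_ij w_ij (1 - x_i x_j)
    val = 0.25 * (W * (1 - np.outer(x, x))).sum()
    # cross-check: this equals the total weight of edges crossing the cut
    S = {j for j in range(n) if x[j] == 1}
    cut = sum(W[i, j] for i in range(n) for j in range(n)
              if (i in S) != (j in S)) / 2
    assert np.isclose(val, cut)
    best = max(best, val)
print(best)
```

Brute force is only viable for tiny n, of course; the point of the SDP relaxation is to get a provably good bound without enumerating the 2ⁿ sign vectors.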

    8 SDP in Convex Optimization

As stated above, SDP has very wide applications in convex optimization. The types of constraints that can be modeled in the SDP framework include: linear inequalities, convex quadratic inequalities, lower bounds on matrix norms, lower bounds on determinants of symmetric positive semidefinite matrices, lower bounds on the geometric mean of a nonnegative vector, plus many others.

Using these and other constructions, the following problems (among many others) can be cast in the form of a semidefinite program: linear programming, optimizing a convex quadratic form subject to convex quadratic inequality constraints, minimizing the volume of an ellipsoid that covers a given set of points and ellipsoids, maximizing the volume of an ellipsoid that is contained in a given polytope, plus a variety of maximum eigenvalue and minimum eigenvalue problems. In the subsections below we demonstrate how some important problems in convex optimization can be re-formulated as instances of SDP.


8.1 SDP for Convex Quadratically Constrained Quadratic Programming, Part I

A convex quadratically constrained quadratic program is a problem of the form:

QCQP : minimize xᵀQ₀x + q₀ᵀx + c₀
       s.t.     xᵀQᵢx + qᵢᵀx + cᵢ ≤ 0, i = 1, . . . , m,

where Q₀ ⪰ 0 and Qᵢ ⪰ 0, i = 1, . . . , m. This problem is the same as:

QCQP : minimize_{x,θ} θ
       s.t.           xᵀQ₀x + q₀ᵀx + c₀ − θ ≤ 0
                      xᵀQᵢx + qᵢᵀx + cᵢ ≤ 0, i = 1, . . . , m.

We can factor each Qᵢ into

Qᵢ = MᵢᵀMᵢ

for some matrix Mᵢ. Then note the equivalence:

⎡ I       Mᵢx         ⎤ ⪰ 0   ⟺   xᵀQᵢx + qᵢᵀx + cᵢ ≤ 0.
⎣ xᵀMᵢᵀ   −cᵢ − qᵢᵀx  ⎦

In this way we can write QCQP as:

QCQP : minimize_{x,θ} θ

       s.t. ⎡ I       M₀x             ⎤ ⪰ 0
            ⎣ xᵀM₀ᵀ   −c₀ − q₀ᵀx + θ  ⎦

            ⎡ I       Mᵢx         ⎤ ⪰ 0, i = 1, . . . , m.
            ⎣ xᵀMᵢᵀ   −cᵢ − qᵢᵀx  ⎦

Notice in the above formulation that the variables are θ and x and that all matrix coefficients are linear functions of θ and x.
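The Schur-complement equivalence used above can be sketched numerically: with Q = MᵀM, the block matrix is psd exactly when the quadratic constraint holds. The data M, q, c and the test points are illustrative:

```python
import numpy as np

# Check: [[I, Mx], [x'M', -c - q'x]] ⪰ 0  ⟺  x'Qx + q'x + c ≤ 0, with Q = M'M.
M = np.array([[1., 1.], [0., 1.]])
Q = M.T @ M
q = np.array([-4., 0.])
c = 1.0

def block_psd(x):
    Mx = M @ x
    top = np.hstack([np.eye(2), Mx[:, None]])
    bot = np.hstack([Mx[None, :], [[-c - q @ x]]])
    return np.linalg.eigvalsh(np.vstack([top, bot])).min() >= -1e-9

def quad_ok(x):
    return x @ Q @ x + q @ x + c <= 1e-9

for x in [np.array([1., 0.]), np.array([0., 0.]), np.array([3., 3.])]:
    assert block_psd(x) == quad_ok(x)
```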

8.2 SDP for Convex Quadratically Constrained Quadratic Programming, Part II

As it turns out, there is an alternative way to formulate QCQP as a semidefinite program. We begin with the following elementary proposition.

Proposition 8.1 Given a vector x ∈ ℝᵏ and a matrix W ∈ ℝᵏˣᵏ, then

⎡ 1  xᵀ ⎤ ⪰ 0   if and only if   W ⪰ xxᵀ.
⎣ x  W  ⎦

Using this proposition, it is straightforward to show that QCQP is equivalent to the following semidefinite program:


QCQP : minimize_{x,W,θ} θ

       s.t. ⎡ c₀ − θ  ½q₀ᵀ ⎤ • ⎡ 1  xᵀ ⎤ ≤ 0
            ⎣ ½q₀     Q₀   ⎦   ⎣ x  W  ⎦

            ⎡ cᵢ    ½qᵢᵀ ⎤ • ⎡ 1  xᵀ ⎤ ≤ 0, i = 1, . . . , m
            ⎣ ½qᵢ   Qᵢ   ⎦   ⎣ x  W  ⎦

            ⎡ 1  xᵀ ⎤ ⪰ 0.
            ⎣ x  W  ⎦

Notice in this formulation that there are now (n+1)(n+2)/2 variables, but that the constraints are all linear inequalities as opposed to semidefinite inequalities (apart from the single semidefinite condition on the matrix formed from 1, x, and W).
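Proposition 8.1 can be sanity-checked on random data (illustrative draws; the two psd tests must always agree):

```python
import numpy as np

# Numerical check of Proposition 8.1: [[1, x'], [x, W]] ⪰ 0 iff W - xx' ⪰ 0.
rng = np.random.default_rng(1)
for _ in range(50):
    x = rng.standard_normal(3)
    B = rng.standard_normal((3, 3))
    W = B @ B.T + rng.uniform(-0.5, 0.5) * np.eye(3)   # symmetric, maybe psd
    M = np.block([[np.ones((1, 1)), x[None, :]], [x[:, None], W]])
    lhs = np.linalg.eigvalsh(M).min() >= -1e-7
    rhs = np.linalg.eigvalsh(W - np.outer(x, x)).min() >= -1e-7
    assert lhs == rhs
```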

8.3 SDP for the Smallest Circumscribed Ellipsoid Problem

A given matrix R ≻ 0 and a given point z can be used to define an ellipsoid in ℝⁿ:

E_{R,z} := {y | (y − z)ᵀR(y − z) ≤ 1}.

One can prove that the volume of E_{R,z} is proportional to √(det(R⁻¹)).

Suppose we are given a convex set X ⊂ ℝⁿ described as the convex hull of k points c₁, . . . , cₖ. We would like to find an ellipsoid circumscribing these k points that has minimum volume. Our problem can be written in the following form:

MCP : minimize_{R,z} vol(E_{R,z})
      s.t.           cᵢ ∈ E_{R,z}, i = 1, . . . , k,

which is equivalent to:

MCP : minimize_{R,z} −ln(det(R))
      s.t.           (cᵢ − z)ᵀR(cᵢ − z) ≤ 1, i = 1, . . . , k
                     R ≻ 0.

Now factor R = M² where M ≻ 0 (that is, M is a square root of R), and now MCP becomes:

MCP : minimize_{M,z} −ln(det(M²))
      s.t.           (cᵢ − z)ᵀMᵀM(cᵢ − z) ≤ 1, i = 1, . . . , k,
                     M ≻ 0.

Next notice the equivalence:

⎡ I             Mcᵢ − Mz ⎤ ⪰ 0   ⟺   (cᵢ − z)ᵀMᵀM(cᵢ − z) ≤ 1.
⎣ (Mcᵢ − Mz)ᵀ   1        ⎦


In this way we can write MCP as:

MCP : minimize_{M,z} −2 ln(det(M))

      s.t. ⎡ I             Mcᵢ − Mz ⎤ ⪰ 0, i = 1, . . . , k,
           ⎣ (Mcᵢ − Mz)ᵀ   1        ⎦

           M ≻ 0.

Last of all, make the substitution y = Mz to obtain:

MCP : minimize_{M,y} −2 ln(det(M))

      s.t. ⎡ I            Mcᵢ − y ⎤ ⪰ 0, i = 1, . . . , k,
           ⎣ (Mcᵢ − y)ᵀ   1       ⎦

           M ≻ 0.

Notice that this last program involves semidefinite constraints where all of the matrix coefficients are linear functions of the variables M and y. However, the objective function is not a linear function. It is possible to convert this problem further into a genuine instance of SDP, because there is a way to convert constraints of the form

−ln(det(X)) ≤ θ

to a semidefinite system. Nevertheless, this is not necessary, either from a theoretical or a practical viewpoint, because it turns out that the function f(X) = −ln(det(X)) is extremely well-behaved and is very easy to optimize (both in theory and in practice).


Finally, note that after solving the formulation of MCP above, we can recover the matrix R and the center z of the optimal ellipsoid by computing

R = M² and z = M⁻¹y.
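The key equivalence in this derivation — the 2×2 block condition versus membership in the ellipsoid — can be checked numerically. The matrix M, center z, and test points below are illustrative:

```python
import numpy as np

# Check: with y = Mz,
#   [[I, Mc - y], [(Mc - y)', 1]] ⪰ 0  ⟺  (c - z)'M'M(c - z) ≤ 1.
M = np.array([[0.5, 0.0], [0.1, 0.4]])
z = np.array([1.0, -1.0])
y = M @ z

def block_ok(c):
    u = M @ c - y
    A = np.block([[np.eye(2), u[:, None]], [u[None, :], np.ones((1, 1))]])
    return np.linalg.eigvalsh(A).min() >= -1e-9

def member(c):
    d = c - z
    return d @ M.T @ M @ d <= 1 + 1e-9

for c in [np.array([1.0, -1.0]), np.array([2.0, 0.0]), np.array([9.0, 9.0])]:
    assert block_ok(c) == member(c)
```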

    8.4 SDP for the Largest Inscribed Ellipsoid Problem

Recall that a given matrix R ≻ 0 and a given point z can be used to define an ellipsoid in ℝⁿ:

E_{R,z} := {x | (x − z)ᵀR(x − z) ≤ 1},

and that the volume of E_{R,z} is proportional to √(det(R⁻¹)).

Suppose we are given a convex set X ⊂ ℝⁿ described as the intersection of k halfspaces {x | aᵢᵀx ≤ bᵢ}, i = 1, . . . , k, that is,

X = {x | Ax ≤ b},

where the ith row of the matrix A consists of the entries of the vector aᵢ, i = 1, . . . , k. We would like to find an ellipsoid inscribed in X of maximum volume. Our problem can be written in the following form:

MIP : maximize_{R,z} vol(E_{R,z})
      s.t.           E_{R,z} ⊂ X,


which is equivalent to:

MIP : maximize_{R,z} det(R⁻¹)
      s.t.           E_{R,z} ⊂ {x | aᵢᵀx ≤ bᵢ}, i = 1, . . . , k
                     R ≻ 0,

which is equivalent to:

MIP : maximize_{R,z} ln(det(R⁻¹))
      s.t.           maxₓ {aᵢᵀx | (x − z)ᵀR(x − z) ≤ 1} ≤ bᵢ, i = 1, . . . , k
                     R ≻ 0.

For a given i = 1, . . . , k, the solution to the optimization problem in the ith constraint is

x* = z + R⁻¹aᵢ / √(aᵢᵀR⁻¹aᵢ),

with optimal objective function value

aᵢᵀz + √(aᵢᵀR⁻¹aᵢ),

and so MIP can be rewritten as:


MIP : maximize_{R,z} ln(det(R⁻¹))
      s.t.           aᵢᵀz + √(aᵢᵀR⁻¹aᵢ) ≤ bᵢ, i = 1, . . . , k
                     R ≻ 0.

Now factor R⁻¹ = M² where M ≻ 0 (that is, M is a square root of R⁻¹), and now MIP becomes:

MIP : maximize_{M,z} ln(det(M²))
      s.t.           aᵢᵀz + √(aᵢᵀMᵀMaᵢ) ≤ bᵢ, i = 1, . . . , k
                     M ≻ 0,

which we can re-write as:

MIP : maximize_{M,z} 2 ln(det(M))
      s.t.           aᵢᵀMᵀMaᵢ ≤ (bᵢ − aᵢᵀz)², i = 1, . . . , k
                     bᵢ − aᵢᵀz ≥ 0, i = 1, . . . , k
                     M ≻ 0.

Next notice the equivalence:

⎡ (bᵢ − aᵢᵀz)I   Maᵢ        ⎤ ⪰ 0   ⟺   aᵢᵀMᵀMaᵢ ≤ (bᵢ − aᵢᵀz)²  and  bᵢ − aᵢᵀz ≥ 0.
⎣ (Maᵢ)ᵀ         bᵢ − aᵢᵀz  ⎦


In this way we can write MIP as:

MIP : maximize_{M,z} 2 ln(det(M))

      s.t. ⎡ (bᵢ − aᵢᵀz)I   Maᵢ        ⎤ ⪰ 0, i = 1, . . . , k,
           ⎣ (Maᵢ)ᵀ         bᵢ − aᵢᵀz  ⎦

           M ≻ 0.

Notice that this last program involves semidefinite constraints where all of the matrix coefficients are linear functions of the variables M and z. However, the objective function is not a linear function. It is possible, but not necessary in practice, to convert this problem further into a genuine instance of SDP, because there is a way to convert constraints of the form

−ln(det(X)) ≤ θ

to a semidefinite system. Such a conversion is not necessary, either from a theoretical or a practical viewpoint, because it turns out that the function f(X) = −ln(det(X)) is extremely well-behaved and is very easy to optimize (both in theory and in practice).

Finally, note that after solving the formulation of MIP above, we can recover the matrix R of the optimal ellipsoid by computing

R = M⁻².


    8.5 SDP for Eigenvalue Optimization

There are many types of eigenvalue optimization problems that can be formulated as SDPs. A typical eigenvalue optimization problem is to create a matrix

S := B − Σᵢ₌₁ᵏ wᵢAᵢ,

given symmetric data matrices B and Aᵢ, i = 1, . . . , k, using weights w₁, . . . , wₖ, in such a way as to minimize the difference between the largest and smallest eigenvalue of S. This problem can be written down as:

EOP : minimize_{w,S} λmax(S) − λmin(S)
      s.t.           S = B − Σᵢ₌₁ᵏ wᵢAᵢ,

where λmin(S) and λmax(S) denote the smallest and the largest eigenvalue of S, respectively.

    We

    now

    show

    how

    to

    convert

    this

    problem

    into

    an

    SDP.

    RecallthatS canbefactored intoS=QDQT whereQ isanorthonor-mal matrix (i.e., Q1 =QT) and D is a diagonal matrix consisting of theeigenvaluesofS. Theconditions:

    ISI

    can

    be

    rewritten

    as:

    Q(I)QT QDQT Q(I)QT.


After premultiplying the above by Qᵀ and postmultiplying by Q, these conditions become:

λI ⪯ D ⪯ μI,

which are equivalent to:

λ ≤ λmin(S) and λmax(S) ≤ μ.

Therefore EOP can be written as:

EOP : minimize_{w,S,λ,μ} μ − λ
      s.t.               S = B − Σᵢ₌₁ᵏ wᵢAᵢ
                         λI ⪯ S ⪯ μI.

This last problem is a semidefinite program. Using constructs such as those shown above, very many other types of eigenvalue optimization problems can be formulated as SDPs.
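The equivalence λI ⪯ S ⪯ μI ⟺ (λ ≤ λmin(S) and λmax(S) ≤ μ) is easy to confirm numerically; the data matrices B, A₁ and the weight below are illustrative:

```python
import numpy as np

# λI ⪯ S ⪯ μI holds exactly when λ ≤ λmin(S) and λmax(S) ≤ μ,
# so minimizing μ - λ subject to these conditions gives the eigenvalue spread.
B  = np.array([[3., 1.], [1., 0.]])
A1 = np.array([[1., 0.], [0., -1.]])
w  = 0.7
S  = B - w * A1

eigs = np.linalg.eigvalsh(S)
lam, mu = eigs.min(), eigs.max()

# λI ⪯ S means S - λI is psd; S ⪯ μI means μI - S is psd.
assert np.linalg.eigvalsh(S - lam * np.eye(2)).min() >= -1e-12
assert np.linalg.eigvalsh(mu * np.eye(2) - S).min() >= -1e-12
print(mu - lam)   # the spread that EOP minimizes over w
```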


    9 SDP in Control Theory

A variety of control and system problems can be cast and solved as instances of SDP. However, this topic is beyond the scope of these notes.

    10 Interior-point Methods for SDP

At the heart of an interior-point method is a barrier function that exerts a repelling force from the boundary of the feasible region. For SDP, we need a barrier function whose values approach +∞ as points X approach the boundary of the semidefinite cone Sⁿ₊.

Let X ∈ Sⁿ₊. Then X will have n eigenvalues, say λ₁(X), . . . , λₙ(X) (possibly counting multiplicities). We can characterize the interior of the semidefinite cone as follows:

int Sⁿ₊ = {X ∈ Sⁿ | λ₁(X) > 0, . . . , λₙ(X) > 0}.

A natural barrier function to use to repel X from the boundary of Sⁿ₊ then is

−Σⱼ₌₁ⁿ ln(λⱼ(X)) = −ln(Πⱼ₌₁ⁿ λⱼ(X)) = −ln(det(X)).

Consider the logarithmic barrier problem BSDP(μ) parameterized by the positive barrier parameter μ:

BSDP(μ) : minimize C • X − μ ln(det(X))
          s.t.     Aᵢ • X = bᵢ, i = 1, . . . , m,
                   X ≻ 0.

Let fμ(X) denote the objective function of BSDP(μ). Then it is not too difficult to derive:

∇fμ(X) = C − μX⁻¹,    (1)

and so the Karush-Kuhn-Tucker conditions for BSDP(μ) are:

Aᵢ • X = bᵢ, i = 1, . . . , m, X ≻ 0,
                                          (2)
C − μX⁻¹ = Σᵢ₌₁ᵐ yᵢAᵢ.

Because X ≻ 0, we can factorize X into X = LLᵀ. We then can define

S = μX⁻¹ = μL⁻ᵀL⁻¹,

which implies

(1/μ) LᵀSL = I,

and we can rewrite the Karush-Kuhn-Tucker conditions as:

Aᵢ • X = bᵢ, i = 1, . . . , m, X ≻ 0, X = LLᵀ
Σᵢ₌₁ᵐ yᵢAᵢ + S = C                               (3)
I − (1/μ) LᵀSL = 0.

From the equations of (3) it follows that if (X, y, S) is a solution of (3), then X is feasible for SDP, (y, S) is feasible for SDD, and the resulting duality gap is

S • X = Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ Sᵢⱼ Xᵢⱼ = Σⱼ₌₁ⁿ (SX)ⱼⱼ = Σⱼ₌₁ⁿ μ = nμ.

This suggests that we try solving BSDP(μ) for a variety of values of μ as μ → 0.
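The duality-gap identity S • X = nμ for S = μX⁻¹ is a one-line consequence of trace(SX) = μ trace(I); it can be checked numerically (the random factor L below is illustrative):

```python
import numpy as np

# If S = μ X⁻¹, then S • X = trace(SX) = μ trace(I) = nμ.
rng = np.random.default_rng(2)
n, mu = 4, 0.3
L = np.tril(rng.standard_normal((n, n))) + 2 * np.eye(n)
X = L @ L.T                        # X ≻ 0 with X = LL'
S = mu * np.linalg.inv(X)

assert np.isclose(np.trace(S @ X), n * mu)
assert np.isclose((S * X).sum(), n * mu)   # S • X = sum_ij S_ij X_ij
```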

However, we cannot usually solve (3) exactly, because the fourth equation group is not linear in the variables. We will instead define a θ-approximate solution of the Karush-Kuhn-Tucker conditions (3). Before doing so, we introduce the following norm on matrices, called the Frobenius norm:

‖M‖ := √(M • M) = √( Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ Mᵢⱼ² ).

For some important properties of the Frobenius norm, see the last subsection of this section. A θ-approximate solution of BSDP(μ) is defined as any solution (X, y, S) of

Aᵢ • X = bᵢ, i = 1, . . . , m, X ≻ 0, X = LLᵀ
Σᵢ₌₁ᵐ yᵢAᵢ + S = C                               (4)
‖I − (1/μ) LᵀSL‖ ≤ θ.

Lemma 10.1 If (X̄, ȳ, S̄) is a θ-approximate solution of BSDP(μ) and θ < 1, then X̄ is feasible for SDP, (ȳ, S̄) is feasible for SDD, and the duality gap satisfies:

nμ(1 − θ) ≤ C • X̄ − Σᵢ₌₁ᵐ ȳᵢbᵢ = X̄ • S̄ ≤ nμ(1 + θ).    (5)

Proof: Primal feasibility is obvious. To prove dual feasibility, we need to show that S̄ ⪰ 0. To see this, define

R = I − (1/μ) L̄ᵀS̄L̄    (6)

and note that ‖R‖ ≤ θ < 1. Rearranging (6), we obtain

S̄ = μ L̄⁻ᵀ(I − R)L̄⁻¹ ≻ 0,

because ‖R‖ < 1 implies that every eigenvalue of I − R is positive. The duality gap is then

X̄ • S̄ = trace(X̄S̄) = μ trace(L̄L̄ᵀL̄⁻ᵀ(I − R)L̄⁻¹) = μ trace(I − R) = μ(n − trace(R)),

and since every eigenvalue of R is at most θ in absolute value, |trace(R)| ≤ nθ, which gives (5).


    q.e.d.

    10.1 The Algorithm

Based on the analysis just presented, we are motivated to develop the following algorithm:

Step 0. Initialization. Data is (X⁰, y⁰, S⁰, μ⁰). k = 0. Assume that (X⁰, y⁰, S⁰) is a θ-approximate solution of BSDP(μ⁰) for some known value of θ that satisfies θ ≤ 1/4.

Then for all α ∈ [0, 1),

fμ⁰( X̄ + (α/‖L̄⁻¹D̄L̄⁻ᵀ‖) D̄ ) ≤ fμ⁰(X̄) − μ⁰ α ‖L̄⁻¹D̄L̄⁻ᵀ‖ + μ⁰ α²/(2(1 − α)).

In particular,

fμ⁰( X̄ + (0.2/‖L̄⁻¹D̄L̄⁻ᵀ‖) D̄ ) ≤ fμ⁰(X̄) − 0.025μ⁰.    (11)

In order to prove this proposition, we will need two powerful facts about the logarithm function:

Fact 1. Suppose that |x| ≤ δ < 1. Then

x − x²/(2(1 − δ)) ≤ ln(1 + x) ≤ x.

Proof: We have:

ln(1 + x) = x − x²/2 + x³/3 − x⁴/4 + . . .
          ≥ x − |x|²/2 − |x|³/3 − |x|⁴/4 − . . .
          ≥ x − |x|²/2 − |x|³/2 − |x|⁴/2 − . . .
          = x − (x²/2)(1 + |x| + |x|² + |x|³ + . . .)
          = x − x²/(2(1 − |x|))
          ≥ x − x²/(2(1 − δ)).


q.e.d.

Fact 2. Suppose that R ∈ Sⁿ and that ‖R‖ < 1. Then

I • R − ‖R‖²/(2(1 − ‖R‖)) ≤ ln(det(I + R)) ≤ I • R.

Proof: Factor R = QDQᵀ where Q is orthonormal and D is a diagonal matrix of the eigenvalues of R. Then first note that ‖R‖ = ‖D‖ = √( Σⱼ₌₁ⁿ Dⱼⱼ² ). We then have:

ln(det(I + R)) = ln(det(I + QDQᵀ)) = ln(det(I + D))
              = Σⱼ₌₁ⁿ ln(1 + Dⱼⱼ)
              ≥ Σⱼ₌₁ⁿ ( Dⱼⱼ − Dⱼⱼ²/(2(1 − ‖D‖)) )
              = I • D − ‖D‖²/(2(1 − ‖R‖))
              = I • QᵀRQ − ‖R‖²/(2(1 − ‖R‖))
              = I • R − ‖R‖²/(2(1 − ‖R‖)).

q.e.d.

Proof of Proposition 10.2. Let

ᾱ = α/‖L̄⁻¹D̄L̄⁻ᵀ‖,

and notice that ‖ᾱ L̄⁻¹D̄L̄⁻ᵀ‖ = α.


Then

fμ⁰(X̄ + ᾱD̄) = C • (X̄ + ᾱD̄) − μ⁰ ln(det(X̄ + ᾱD̄))
            = C • X̄ + ᾱ C • D̄ − μ⁰ ln(det( L̄(I + ᾱL̄⁻¹D̄L̄⁻ᵀ)L̄ᵀ ))
            = C • X̄ − μ⁰ ln(det(X̄)) + ᾱ C • D̄ − μ⁰ ln(det(I + ᾱL̄⁻¹D̄L̄⁻ᵀ))
            ≤ fμ⁰(X̄) + ᾱ C • D̄ − μ⁰ ᾱ I • (L̄⁻¹D̄L̄⁻ᵀ) + μ⁰ α²/(2(1 − α))
            = fμ⁰(X̄) + ᾱ (C − μ⁰X̄⁻¹) • D̄ + μ⁰ α²/(2(1 − α)).

Now, (D̄, ȳ) solve the Normal equations:

μ⁰ X̄⁻¹D̄X̄⁻¹ + C − μ⁰X̄⁻¹ = Σᵢ₌₁ᵐ ȳᵢAᵢ
                                            (12)
Aᵢ • D̄ = 0, i = 1, . . . , m.

Taking the inner product of both sides of the first equation above with D̄ and rearranging yields:

μ⁰ X̄⁻¹D̄X̄⁻¹ • D̄ = −(C − μ⁰X̄⁻¹) • D̄.

Substituting this in our inequality above yields:

fμ⁰(X̄ + ᾱD̄) ≤ fμ⁰(X̄) − μ⁰ ᾱ X̄⁻¹D̄X̄⁻¹ • D̄ + μ⁰ α²/(2(1 − α))
            = fμ⁰(X̄) − μ⁰ ᾱ ‖L̄⁻¹D̄L̄⁻ᵀ‖² + μ⁰ α²/(2(1 − α))
            = fμ⁰(X̄) − μ⁰ α ‖L̄⁻¹D̄L̄⁻ᵀ‖ + μ⁰ α²/(2(1 − α)).

Substituting α = 0.2 and ‖L̄⁻¹D̄L̄⁻ᵀ‖ > 1/4 yields the final result. q.e.d.

Last of all, we prove a bound on the number of iterations that the algorithm will need in order to find a 1/4-approximate solution of BSDP(μ⁰):

Proposition 10.3 Suppose that X⁰ satisfies Aᵢ • X⁰ = bᵢ, i = 1, . . . , m, and X⁰ ≻ 0. Let μ⁰ be given and let f*μ⁰ be the optimal objective function value of BSDP(μ⁰). Then the algorithm initiated at X⁰ will find a 1/4-approximate solution of BSDP(μ⁰) in at most

k = ⌈ (fμ⁰(X⁰) − f*μ⁰) / (0.025μ⁰) ⌉

iterations.

Proof: This follows immediately from (11). Each iteration that is not a 1/4-approximate solution decreases the objective function fμ⁰(X) of BSDP(μ⁰) by at least 0.025μ⁰. Therefore, there cannot be more than

(fμ⁰(X⁰) − f*μ⁰) / (0.025μ⁰)


iterations that are not 1/4-approximate solutions of BSDP(μ⁰). q.e.d.

    10.5 Some Properties of the Frobenius Norm

Proposition 10.4 If M ∈ Sⁿ, then

1. ‖M‖ = √( Σⱼ₌₁ⁿ (λⱼ(M))² ), where λ₁(M), λ₂(M), . . . , λₙ(M) is an enumeration of the n eigenvalues of M.

2. If λ is any eigenvalue of M, then |λ| ≤ ‖M‖.

3. |trace(M)| ≤ √n ‖M‖.

4. If ‖M‖ < 1, then I + M ≻ 0.

q.e.d.


Proposition 10.5 If A, B ∈ Sⁿ, then ‖AB‖ ≤ ‖A‖ ‖B‖.

Proof: Applying the Cauchy–Schwarz inequality to each entry of AB, we have

    ‖AB‖² = Σ_{i=1}^n Σ_{j=1}^n ( Σ_{k=1}^n A_{ik}B_{kj} )²

          ≤ Σ_{i=1}^n Σ_{j=1}^n ( Σ_{k=1}^n A_{ik}² ) ( Σ_{k=1}^n B_{kj}² )

          = ( Σ_{i=1}^n Σ_{k=1}^n A_{ik}² ) ( Σ_{j=1}^n Σ_{k=1}^n B_{kj}² ) = ‖A‖² ‖B‖² .

q.e.d.
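A quick numerical check of Proposition 10.5 on invented matrices:

```python
import math

def frob(A):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in A for x in row))

def matmul(A, B):
    """Plain matrix product via inner products of rows with columns."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2.0, 1.0], [1.0, 3.0]]
B = [[1.0, -1.0], [-1.0, 2.0]]
assert frob(matmul(A, B)) <= frob(A) * frob(B)   # Proposition 10.5
```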

11 Issues in Solving SDP using the Ellipsoid Algorithm

To see how the ellipsoid algorithm is used to solve a semidefinite program, assume for convenience that the format of the problem is that of the dual problem SDD. Then the feasible region of the problem can be written as:

    F = { y = (y_1, . . . , y_m) ∈ ℝ^m  |  C − Σ_{i=1}^m y_i A_i ⪰ 0 } ,

and the objective function is Σ_{i=1}^m y_i b_i. Note that F is just a convex region in ℝ^m.

Recall that at any iteration of the ellipsoid algorithm, the set of solutions of SDD is known to lie in the current ellipsoid, and the center of the current ellipsoid is, say, ȳ = (ȳ_1, . . . , ȳ_m). If ȳ ∈ F, then we perform an optimality cut of the form Σ_{i=1}^m y_i b_i ≥ Σ_{i=1}^m ȳ_i b_i, and use standard formulas to update the ellipsoid and its new center. If ȳ ∉ F, then we perform a feasibility cut by computing a vector h ∈ ℝ^m such that h^T ȳ > h^T y for all y ∈ F.

There are four issues that must be resolved in order to implement the above version of the ellipsoid algorithm to solve SDD:

1. Testing if ȳ ∈ F. This is done by computing the matrix S = C − Σ_{i=1}^m ȳ_i A_i. If S ⪰ 0, then ȳ ∈ F. Testing if the matrix S ⪰ 0 can be done by computing an exact Cholesky factorization of S, which takes O(n³) operations, assuming that computing square roots can be performed (exactly) in one operation.
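A minimal sketch of such a PSD test on small dense matrices (it uses a square-root-free LDLᵀ-style elimination rather than a literal Cholesky factorization, so the exact-square-root assumption is not needed; the floating-point tolerance is an implementation choice, not part of the text's exact-arithmetic model):

```python
def is_psd_cholesky(S, tol=1e-10):
    """Test S >= 0 (PSD) by symmetric Gaussian elimination (LDL^T pivots).

    S is a symmetric matrix given as a list of rows. The factorization
    completes with nonnegative pivots exactly when S is PSD (up to the
    tolerance), and takes O(n^3) operations as in the text.
    """
    n = len(S)
    A = [row[:] for row in S]            # work on a copy
    for k in range(n):
        pivot = A[k][k]
        if pivot < -tol:
            return False                 # negative pivot: not PSD
        if pivot <= tol:
            # Zero pivot: the rest of the row must vanish for PSD-ness.
            if any(abs(A[k][j]) > tol for j in range(k + 1, n)):
                return False
            continue
        for i in range(k + 1, n):        # eliminate below/right of the pivot
            f = A[i][k] / pivot
            for j in range(k + 1, n):
                A[i][j] -= f * A[k][j]
    return True

assert is_psd_cholesky([[2.0, 1.0], [1.0, 2.0]])        # eigenvalues 1, 3
assert not is_psd_cholesky([[1.0, 2.0], [2.0, 1.0]])    # eigenvalues -1, 3
```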

2. Computing a feasibility cut. As above, testing if S ⪰ 0 can be computed in O(n³) operations. If S is not positive semidefinite, then again assuming exact arithmetic, we can find an n-vector v such that v^T S v < 0 in O(n³) operations as well. Then the feasibility cut vector h is computed by the formula:

    h_i = v^T A_i v,  i = 1, . . . , m ,

whose computation requires O(mn²) operations. Notice that for any y ∈ F,

    h^T y = Σ_{i=1}^m y_i v^T A_i v = v^T ( Σ_{i=1}^m y_i A_i ) v ≤ v^T C v < v^T ( Σ_{i=1}^m ȳ_i A_i ) v = Σ_{i=1}^m ȳ_i v^T A_i v = h^T ȳ ,

thereby showing that h indeed provides a feasibility cut for F at y = ȳ.
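A tiny worked instance of this cut (the data C, A₁ and the certificate v below are invented for illustration; with m = 1 and n = 2, the feasible region F is just the interval [−1, 1]):

```python
# F = { y : C - y*A >= 0 } with C = I and A = diag(1, -1), i.e. F = [-1, 1].

def quad_form(v, M):
    """Compute v^T M v for a small dense matrix M."""
    n = len(v)
    return sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n))

C = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 0.0], [0.0, -1.0]]

y_bar = 2.0                      # infeasible center: C - 2A = diag(-1, 3)
S = [[C[i][j] - y_bar * A[i][j] for j in range(2)] for i in range(2)]

v = [1.0, 0.0]                   # certificate of infeasibility: v^T S v = -1 < 0
assert quad_form(v, S) < 0

h = quad_form(v, A)              # cut coefficient h_1 = v^T A_1 v = 1
# Every feasible y (here y in [-1, 1]) satisfies h*y <= v^T C v < h*y_bar:
for y in [-1.0, 0.0, 1.0]:
    assert h * y <= quad_form(v, C) < h * y_bar
```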

3. Starting the ellipsoid algorithm. We need to determine an upper bound R on the distance of some optimal solution y from the origin. This cannot be done by examining the input length of the data, as is the case in linear programming. One needs to know some special information about the specific problem at hand in order to determine R before solving the semidefinite program.

4. Stopping the ellipsoid algorithm. Suppose that we seek an ε-optimal solution of SDD. In order to prove a complexity bound on the number of iterations needed to find an ε-optimal solution, we need to know beforehand the radius r of a Euclidean ball that is contained in the set of ε-optimal solutions of SDD. The value of r also cannot be determined by examining the input length of the data, as is the case in linear programming. One needs to know some special information about the specific problem at hand in order to determine r before solving the semidefinite program.

    12 Current Research in SDP

There are many very active research areas in semidefinite programming in nonlinear (convex) programming, in combinatorial optimization, and in control theory. In the area of convex analysis, recent research topics include the geometry and the boundary structure of SDP feasible regions (including notions of degeneracy) and research related to the computational complexity of SDP, such as decidability questions, certificates of infeasibility, and duality theory. In the area of combinatorial optimization, there has been much research on the practical and the theoretical use of SDP relaxations of hard combinatorial optimization problems. As regards interior point methods, there are a host of research issues, mostly involving the development of different interior point algorithms and their properties, including rates of convergence, performance guarantees, etc.

    13 Computational State of the Art of SDP

Because SDP has so many applications, and because interior point methods show so much promise, perhaps the most exciting area of research on SDP has to do with computation and implementation of interior point algorithms for solving SDP. Much research has focused on the practical efficiency of interior point methods for SDP. However, in the research to date, computational issues have arisen that are much more complex than those for linear programming, and these computational issues are only beginning to be well understood. They probably stem from a variety of factors, including the fact that SDP is not guaranteed to have strictly complementary optimal solutions (as is the case in linear programming). Finally, because SDP is such a new field, there is no representative suite of practical problems on which to test algorithms, i.e., there is no equivalent version of the netlib suite of industrial linear programming problems.

A good website for semidefinite programming is:

http://www.zib.de/helmberg/semidef.html.

    14 Exercises

1. For a (square) matrix M ∈ ℝ^{n×n}, define trace(M) = Σ_{j=1}^n M_{jj}, and for two matrices A, B ∈ ℝ^{k×l} define

    A • B := Σ_{i=1}^k Σ_{j=1}^l A_{ij} B_{ij} .

Prove that:

(a) A • B = trace(A^T B).

(b) trace(MN) = trace(NM).
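Both identities are easy to sanity-check numerically; the matrices below are arbitrary:

```python
# Quick numeric check of (a) and (b) on small invented matrices.

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def trace(M):
    return sum(M[j][j] for j in range(len(M)))

def dot(A, B):
    """A . B = sum of entrywise products."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]

assert dot(A, B) == trace(matmul(transpose(A), B))   # part (a)
assert trace(matmul(A, B)) == trace(matmul(B, A))    # part (b)
```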

2. Let S^{k×k}_+ denote the cone of positive semidefinite symmetric matrices, namely S^{k×k}_+ = { X ∈ S^{k×k} | v^T X v ≥ 0 for all v ∈ ℝ^k }. Considering S^{k×k}_+ as a cone, prove that (S^{k×k}_+)* = S^{k×k}_+, thus showing that S^{k×k}_+ is self-dual.


3. Consider the problem:

    (P): minimize_d  d^T Q d
         s.t.        d^T M d = 1 .

If M ≻ 0, show that (P) is equivalent to:

    (S): minimize_X  Q • X
         s.t.        M • X = 1
                     X ⪰ 0 .

What is the SDP dual of (S)?

4. Prove that X ⪰ x x^T if and only if

    [ X    x ]
    [ x^T  1 ]  ⪰ 0 .

5. Let K := { X ∈ S^{k×k} | X ⪰ 0 } and define

    K* := { S ∈ S^{k×k} | S • X ≥ 0 for all X ⪰ 0 } .

Prove that K = K*.

6. Let λ_min denote the smallest eigenvalue of the symmetric matrix Q. Show that the following three optimization problems each have optimal objective function value equal to λ_min.

    (P1): minimize_d  d^T Q d
          s.t.        d^T I d = 1 .

    (P2): maximize  λ
          s.t.      Q ⪰ λI .

    (P3): minimize_X  Q • X
          s.t.        I • X = 1
                      X ⪰ 0 .
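A quick numerical sanity check of the claim on an invented 2×2 instance, using the closed-form eigenvalues of a symmetric 2×2 matrix rather than an SDP solver:

```python
import math

# Q = [[2, 1], [1, 2]] has eigenvalues 1 and 3, so lambda_min = 1.
a, b, c = 2.0, 1.0, 2.0                       # Q = [[a, b], [b, c]]
lam_min = (a + c - math.sqrt((a - c) ** 2 + 4 * b * b)) / 2
assert lam_min == 1.0

# (P1): the unit vector d = (1/sqrt(2), -1/sqrt(2)) attains d^T Q d = lambda_min.
d = (1 / math.sqrt(2), -1 / math.sqrt(2))
val = a * d[0] ** 2 + 2 * b * d[0] * d[1] + c * d[1] ** 2
assert abs(val - lam_min) < 1e-12

# (P3): X = d d^T is feasible (I . X = 1, X >= 0) and attains Q . X = lambda_min.
X = [[d[i] * d[j] for j in range(2)] for i in range(2)]
QX = sum(q * x for qr, xr in zip([[a, b], [b, c]], X) for q, x in zip(qr, xr))
assert abs(QX - lam_min) < 1e-12
```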

7. Suppose that Q ⪰ 0. Prove the following:

    x^T Q x + q^T x + c ≤ 0

if and only if there exists W for which

    [ c        (1/2)q^T ]     [ 1    x^T ]                    [ 1    x^T ]
    [ (1/2)q   Q        ]  •  [ x    W   ]  ≤ 0      and      [ x    W   ]  ⪰ 0 .

Hint: use the property shown in Exercise 4.

8. Consider the matrix

    M = [ A    B ]
        [ B^T  C ] ,

where A and C are symmetric matrices and A ≻ 0 (in particular, A is nonsingular). Prove that M ⪰ 0 if and only if C − B^T A^{-1} B ⪰ 0.
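A numerical sanity check of Exercise 8 with invented 1×1 blocks (so positive semidefiniteness reduces to sign conditions on the diagonal entries and the determinant):

```python
# M = [[A, B], [B, C]] with scalar blocks A = 2, B = 1 and varying C.

def is_psd_2x2(m11, m12, m22, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff both diagonal entries and the
    determinant are nonnegative."""
    return m11 >= -tol and m22 >= -tol and m11 * m22 - m12 * m12 >= -tol

A, B = 2.0, 1.0
for C in [0.25, 0.5, 1.0]:
    schur = C - B * (1.0 / A) * B          # C - B^T A^{-1} B = C - 0.5
    # M is PSD exactly when the Schur complement is nonnegative:
    assert is_psd_2x2(A, B, C) == (schur >= 0)
```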