Lec23 Semidef Opt



    Introduction to Semidefinite Programming (SDP)

    Robert M. Freund

    March, 2004


© 2004 Massachusetts Institute of Technology.


    1 Introduction

Semidefinite programming (SDP) is the most exciting development in mathematical programming in the 1990s. SDP has applications in such diverse fields as traditional convex constrained optimization, control theory, and combinatorial optimization. Because SDP is solvable via interior point methods, most of these applications can usually be solved very efficiently in practice as well as in theory.

    2 Review of Linear Programming

Consider the linear programming problem in standard form:

LP : minimize cᵀx
     s.t.     aᵢᵀx = bᵢ, i = 1, . . . , m
              x ∈ ℝⁿ₊.

Here x is a vector of n variables, and we write cᵀx for the inner product Σⱼ₌₁ⁿ cⱼxⱼ, etc.

Also, ℝⁿ₊ := {x ∈ ℝⁿ | x ≥ 0}, and we call ℝⁿ₊ the nonnegative orthant. In fact, ℝⁿ₊ is a closed convex cone, where K is called a closed convex cone if K satisfies the following two conditions:

• If x, w ∈ K, then αx + βw ∈ K for all nonnegative scalars α and β.

• K is a closed set.


In words, LP is the following problem:

Minimize the linear function cᵀx, subject to the condition that x must solve m given equations aᵢᵀx = bᵢ, i = 1, . . . , m, and that x must lie in the closed convex cone K = ℝⁿ₊.

We will write the standard linear programming dual problem as:

LD : maximize Σᵢ₌₁ᵐ yᵢbᵢ
     s.t.     Σᵢ₌₁ᵐ yᵢaᵢ + s = c
              s ∈ ℝⁿ₊.

Given a feasible solution x of LP and a feasible solution (y, s) of LD, the duality gap is simply cᵀx − Σᵢ₌₁ᵐ yᵢbᵢ = (c − Σᵢ₌₁ᵐ yᵢaᵢ)ᵀx = sᵀx ≥ 0, because x ≥ 0 and s ≥ 0. We know from LP duality theory that so long as the primal problem LP is feasible and has bounded optimal objective value, then the primal and the dual both attain their optima with no duality gap. That is, there exist x* and (y*, s*) feasible for the primal and dual, respectively, such that cᵀx* − Σᵢ₌₁ᵐ yᵢ*bᵢ = (s*)ᵀx* = 0.
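The gap identity cᵀx − yᵀb = sᵀx can be checked numerically. The sketch below uses a small illustrative LP instance (the data A, b, c and the feasible pair are made up for this check, not taken from the notes):

```python
import numpy as np

# Illustrative LP data: rows of A are the vectors a_i.
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 0.0]])
b = np.array([4.0, 3.0])
c = np.array([2.0, 3.0, 1.0])

x = np.array([1.0, 1.0, 2.0])     # primal feasible: A @ x = b, x >= 0
y = np.array([1.0, 0.5])          # dual multipliers
s = c - A.T @ y                   # dual slack: sum_i y_i a_i + s = c

assert np.allclose(A @ x, b) and (x >= 0).all() and (s >= 0).all()

# Duality gap: c'x - y'b = (c - sum_i y_i a_i)'x = s'x >= 0
gap = c @ x - y @ b
assert np.isclose(gap, s @ x) and gap >= 0
print(gap)
```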

    3 Facts about Matrices and the Semidefinite Cone

If X is an n × n matrix, then X is a positive semidefinite (psd) matrix if

vᵀXv ≥ 0 for any v ∈ ℝⁿ.


If X is an n × n matrix, then X is a positive definite (pd) matrix if

vᵀXv > 0 for any v ∈ ℝⁿ, v ≠ 0.

Let Sⁿ denote the set of symmetric n × n matrices, and let Sⁿ₊ denote the set of positive semidefinite (psd) n × n symmetric matrices. Similarly let Sⁿ₊₊ denote the set of positive definite (pd) n × n symmetric matrices.

Let X and Y be any symmetric matrices. We write X ⪰ 0 to denote that X is symmetric and positive semidefinite, and we write X ⪰ Y to denote that X − Y ⪰ 0. We write X ≻ 0 to denote that X is symmetric and positive definite, etc.

Remark 1 Sⁿ₊ = {X ∈ Sⁿ | X ⪰ 0} is a closed convex cone in ℝⁿ² of dimension n × (n + 1)/2.

To see why this remark is true, suppose that X, W ∈ Sⁿ₊. Pick any scalars α, β ≥ 0. For any v ∈ ℝⁿ, we have:

vᵀ(αX + βW)v = αvᵀXv + βvᵀWv ≥ 0,

whereby αX + βW ∈ Sⁿ₊. This shows that Sⁿ₊ is a cone. It is also straightforward to show that Sⁿ₊ is a closed set.

Recall the following properties of symmetric matrices:

• If X ∈ Sⁿ, then X = QDQᵀ for some orthonormal matrix Q and some diagonal matrix D. (Recall that Q is orthonormal means that Q⁻¹ = Qᵀ, and that D is diagonal means that the off-diagonal entries of D are all zeros.)

• If X = QDQᵀ as above, then the columns of Q form a set of n orthogonal eigenvectors of X, whose eigenvalues are the corresponding diagonal entries of D.

• X ⪰ 0 if and only if X = QDQᵀ where the eigenvalues (i.e., the diagonal entries of D) are all nonnegative.

• X ≻ 0 if and only if X = QDQᵀ where the eigenvalues (i.e., the diagonal entries of D) are all positive.

• If X ⪰ 0 and if Xᵢᵢ = 0, then Xᵢⱼ = Xⱼᵢ = 0 for all j = 1, . . . , n.

• Consider the matrix M defined as follows:

M = ⎡ P   v ⎤
    ⎣ vᵀ  d ⎦ ,

where P ≻ 0, v is a vector, and d is a scalar. Then M ≻ 0 if and only if d − vᵀP⁻¹v > 0.
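Two of the facts above can be sketched numerically: the eigenvalue test for positive semidefiniteness, and the Schur-complement criterion for M ≻ 0. The matrices below are illustrative, not from the notes:

```python
import numpy as np

# X ⪰ 0 iff all eigenvalues of X are nonnegative.
X = np.array([[2.0, -1.0], [-1.0, 2.0]])   # symmetric, eigenvalues 1 and 3
assert (np.linalg.eigvalsh(X) >= 0).all()

# M = [[P, v], [v', d]] with P ≻ 0 is pd iff d - v'P⁻¹v > 0.
P = np.array([[4.0, 0.0], [0.0, 1.0]])
v = np.array([1.0, 1.0])
d = 2.0
M = np.block([[P, v[:, None]], [v[None, :], np.array([[d]])]])

schur = d - v @ np.linalg.solve(P, v)      # d - v'P⁻¹v = 2 - 1.25 = 0.75
assert schur > 0
assert (np.linalg.eigvalsh(M) > 0).all()   # hence M is positive definite
```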

    4 Semidefinite Programming

Let X ∈ Sⁿ. We can think of X as a matrix, or equivalently, as an array of n² components of the form (x₁₁, . . . , xₙₙ). We can also just think of X as an object (a vector) in the space Sⁿ. All three different equivalent ways of looking at X will be useful.

What will a linear function of X look like? If C(X) is a linear function of X, then C(X) can be written as C • X, where

C • X := Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ Cᵢⱼ Xᵢⱼ.

If X is a symmetric matrix, there is no loss of generality in assuming that the matrix C is also symmetric. With this notation, we are now ready to define a semidefinite program. A semidefinite program (SDP) is an optimization problem of the form:

SDP : minimize C • X
      s.t.     Aᵢ • X = bᵢ, i = 1, . . . , m,
               X ⪰ 0.

Notice that in an SDP the variable is the matrix X, but it might be helpful to think of X as an array of n² numbers or simply as a vector in Sⁿ. The objective function is the linear function C • X, and there are m linear equations that X must satisfy, namely Aᵢ • X = bᵢ, i = 1, . . . , m. The variable X also must lie in the (closed convex) cone of positive semidefinite symmetric matrices Sⁿ₊. Note that the data for SDP consists of the symmetric matrix C (which is the data for the objective function), the m symmetric matrices A₁, . . . , Aₘ, and the m-vector b, which form the m linear equations.

Let us see an example of an SDP for n = 3 and m = 2. Define the following matrices:

A₁ = ⎡ 1 0 1 ⎤     A₂ = ⎡ 0 2 8 ⎤     C = ⎡ 1 2 3 ⎤
     ⎢ 0 3 7 ⎥ ,        ⎢ 2 6 0 ⎥ ,       ⎢ 2 9 0 ⎥ ,
     ⎣ 1 7 5 ⎦          ⎣ 8 0 4 ⎦         ⎣ 3 0 7 ⎦


and b₁ = 11 and b₂ = 19. Then the variable X will be the 3 × 3 symmetric matrix:

X = ⎡ x₁₁ x₁₂ x₁₃ ⎤
    ⎢ x₂₁ x₂₂ x₂₃ ⎥ ,
    ⎣ x₃₁ x₃₂ x₃₃ ⎦

and so, for example,

C • X = x₁₁ + 2x₁₂ + 3x₁₃ + 2x₂₁ + 9x₂₂ + 0x₂₃ + 3x₃₁ + 0x₃₂ + 7x₃₃
      = x₁₁ + 4x₁₂ + 6x₁₃ + 9x₂₂ + 0x₂₃ + 7x₃₃,

since, in particular, X is symmetric. Therefore the SDP can be written as:

SDP : minimize x₁₁ + 4x₁₂ + 6x₁₃ + 9x₂₂ + 0x₂₃ + 7x₃₃
      s.t.     x₁₁ + 0x₁₂ + 2x₁₃ + 3x₂₂ + 14x₂₃ + 5x₃₃ = 11
               0x₁₁ + 4x₁₂ + 16x₁₃ + 6x₂₂ + 0x₂₃ + 4x₃₃ = 19

               X = ⎡ x₁₁ x₁₂ x₁₃ ⎤
                   ⎢ x₂₁ x₂₂ x₂₃ ⎥ ⪰ 0.
                   ⎣ x₃₁ x₃₂ x₃₃ ⎦
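The expanded coefficients above can be verified with the data of the example; only the sample symmetric matrix X below is made up for the check:

```python
import numpy as np

# The data of the example above.
A1 = np.array([[1., 0., 1.], [0., 3., 7.], [1., 7., 5.]])
C  = np.array([[1., 2., 3.], [2., 9., 0.], [3., 0., 7.]])

def inner(A, X):
    # A • X = sum_ij A_ij X_ij = trace(A'X)
    return float(np.trace(A.T @ X))

# A sample symmetric X (entries chosen arbitrarily for illustration).
X = np.array([[1., 2., 0.], [2., 1., 1.], [0., 1., 1.]])

# C • X should match the expanded form x11 + 4x12 + 6x13 + 9x22 + 0x23 + 7x33
expanded = X[0,0] + 4*X[0,1] + 6*X[0,2] + 9*X[1,1] + 0*X[1,2] + 7*X[2,2]
assert np.isclose(inner(C, X), expanded)

# and A1 • X matches x11 + 2x13 + 3x22 + 14x23 + 5x33.
assert np.isclose(inner(A1, X),
                  X[0,0] + 2*X[0,2] + 3*X[1,1] + 14*X[1,2] + 5*X[2,2])
```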

Notice that SDP looks remarkably similar to a linear program. However, the standard LP constraint that x must lie in the nonnegative orthant is replaced by the constraint that the variable X must lie in the cone of positive semidefinite matrices. Just as x ≥ 0 states that each of the n components of x must be nonnegative, it may be helpful to think of X ⪰ 0 as stating that each of the n eigenvalues of X must be nonnegative.

It is easy to see that a linear program LP is a special instance of an SDP. To see one way of doing this, suppose that (c, a₁, . . . , aₘ, b₁, . . . , bₘ) comprise the data for LP. Then define:

Aᵢ = ⎡ aᵢ₁ 0   . . . 0   ⎤
     ⎢ 0   aᵢ₂ . . . 0   ⎥                          C = ⎡ c₁ 0  . . . 0  ⎤
     ⎢ .   .   . . . .   ⎥ , i = 1, . . . , m, and      ⎢ 0  c₂ . . . 0  ⎥
     ⎣ 0   0   . . . aᵢₙ ⎦                              ⎢ .  .  . . . .  ⎥
                                                        ⎣ 0  0  . . . cₙ ⎦ .

Then LP can be written as:

SDP : minimize C • X
      s.t.     Aᵢ • X = bᵢ, i = 1, . . . , m,
               Xᵢⱼ = 0, i = 1, . . . , n, j = i + 1, . . . , n,
               X ⪰ 0,

with the association that

X = ⎡ x₁ 0  . . . 0  ⎤
    ⎢ 0  x₂ . . . 0  ⎥
    ⎢ .  .  . . . .  ⎥
    ⎣ 0  0  . . . xₙ ⎦ .

Of course, in practice one would never want to convert an instance of LP into an instance of SDP. The above construction merely shows that SDP includes linear programming as a special case.

    5 Semidefinite Programming Duality

The dual problem of SDP is defined (or derived from first principles) to be:

SDD : maximize Σᵢ₌₁ᵐ yᵢbᵢ
      s.t.     Σᵢ₌₁ᵐ yᵢAᵢ + S = C
               S ⪰ 0.

One convenient way of thinking about this problem is as follows. Given multipliers y₁, . . . , yₘ, the objective is to maximize the linear function Σᵢ₌₁ᵐ yᵢbᵢ. The constraints of SDD state that the matrix S defined as

S = C − Σᵢ₌₁ᵐ yᵢAᵢ

must be positive semidefinite. That is,

C − Σᵢ₌₁ᵐ yᵢAᵢ ⪰ 0.

We illustrate this construction with the example presented earlier. The dual problem is:


SDD : maximize 11y₁ + 19y₂

      s.t. y₁ ⎡ 1 0 1 ⎤      ⎡ 0 2 8 ⎤       ⎡ 1 2 3 ⎤
              ⎢ 0 3 7 ⎥ + y₂ ⎢ 2 6 0 ⎥ + S = ⎢ 2 9 0 ⎥
              ⎣ 1 7 5 ⎦      ⎣ 8 0 4 ⎦       ⎣ 3 0 7 ⎦

           S ⪰ 0,

which we can rewrite in the following form:

SDD : maximize 11y₁ + 19y₂

      s.t. ⎡ 1 − 1y₁ − 0y₂   2 − 0y₁ − 2y₂   3 − 1y₁ − 8y₂ ⎤
           ⎢ 2 − 0y₁ − 2y₂   9 − 3y₁ − 6y₂   0 − 7y₁ − 0y₂ ⎥ ⪰ 0.
           ⎣ 3 − 1y₁ − 8y₂   0 − 7y₁ − 0y₂   7 − 5y₁ − 4y₂ ⎦

It is often easier to see and to work with a semidefinite program when it is presented in the format of the dual SDD, since the variables are the m multipliers y₁, . . . , yₘ.

As in linear programming, we can switch from one format of SDP (primal or dual) to any other format with great ease, and there is no loss of generality in assuming a particular specific format for the primal or the dual.

The following proposition states that weak duality must hold for the primal and dual of SDP:


Proposition 5.1 Given a feasible solution X of SDP and a feasible solution (y, S) of SDD, the duality gap is C • X − Σᵢ₌₁ᵐ yᵢbᵢ = S • X ≥ 0. If C • X − Σᵢ₌₁ᵐ yᵢbᵢ = 0, then X and (y, S) are each optimal solutions to SDP and SDD, respectively, and furthermore, SX = 0.

In order to prove Proposition 5.1, it will be convenient to work with the trace of a matrix, defined below:

trace(M) = Σⱼ₌₁ⁿ Mⱼⱼ.

Simple arithmetic can be used to establish the following two elementary identities:

trace(MN) = trace(NM)

A • B = trace(AᵀB)
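Both identities are easy to confirm numerically on random matrices (the random data below is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
N = rng.standard_normal((4, 4))
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# trace(MN) = trace(NM)
assert np.isclose(np.trace(M @ N), np.trace(N @ M))

# A • B := sum_ij A_ij B_ij equals trace(A'B)
assert np.isclose((A * B).sum(), np.trace(A.T @ B))
```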

Proof of Proposition 5.1. For the first part of the proposition, we must show that if S ⪰ 0 and X ⪰ 0, then S • X ≥ 0. Let S = PDPᵀ and X = QEQᵀ, where P, Q are orthonormal matrices and D, E are nonnegative diagonal matrices. We have:

S • X = trace(SᵀX) = trace(SX) = trace(PDPᵀQEQᵀ)
      = trace(DPᵀQEQᵀP) = Σⱼ₌₁ⁿ Dⱼⱼ (PᵀQEQᵀP)ⱼⱼ ≥ 0,

where the last inequality follows from the fact that all Dⱼⱼ ≥ 0 and the fact that the diagonal of the symmetric positive semidefinite matrix PᵀQEQᵀP must be nonnegative.

To prove the second part of the proposition, suppose that trace(SX) = 0. Then from the above equalities, we have

Σⱼ₌₁ⁿ Dⱼⱼ (PᵀQEQᵀP)ⱼⱼ = 0.

However, this implies that for each j = 1, . . . , n, either Dⱼⱼ = 0 or (PᵀQEQᵀP)ⱼⱼ = 0. Furthermore, the latter case implies that the jth row of PᵀQEQᵀP is all zeros. Therefore DPᵀQEQᵀP = 0, and so SX = PDPᵀQEQᵀ = 0. q.e.d.

Unlike the case of linear programming, we cannot assert that either SDP or SDD will attain their respective optima, and/or that there will be no duality gap, unless certain regularity conditions hold. One such regularity condition which ensures that strong duality will prevail is a version of the Slater condition, summarized in the following theorem which we will not prove:

Theorem 5.1 Let z*_P and z*_D denote the optimal objective function values of SDP and SDD, respectively. Suppose that there exists a feasible solution X̂ of SDP such that X̂ ≻ 0, and that there exists a feasible solution (ŷ, Ŝ) of SDD such that Ŝ ≻ 0. Then both SDP and SDD attain their optimal values, and z*_P = z*_D.


6 Key Properties of Linear Programming that do not extend to SDP

The following summarizes some of the more important properties of linear programming that do not extend to SDP:

• There may be a finite or infinite duality gap. The primal and/or dual may or may not attain their optima. As noted above in Theorem 5.1, both programs will attain their common optimal value if both programs have feasible solutions in the interior of the semidefinite cone.

• There is no finite algorithm for solving SDP. There is a simplex algorithm, but it is not a finite algorithm. There is no direct analog of a basic feasible solution for SDP.

• Given rational data, the feasible region may have no rational solutions. The optimal solution may not have rational components or rational eigenvalues.

• Given rational data whose binary encoding is size L, the norms of any feasible and/or optimal solutions may exceed 2^(2L) (or worse).

• Given rational data whose binary encoding is size L, the norms of any feasible and/or optimal solutions may be less than 2^(−2L) (or worse).

    7 SDP in Combinatorial Optimization

SDP has wide applicability in combinatorial optimization. A number of NP-hard combinatorial optimization problems have convex relaxations that are semidefinite programs. In many instances, the SDP relaxation is very tight in practice, and in certain instances in particular, the optimal solution to the SDP relaxation can be converted to a feasible solution for the original problem with provably good objective value. An example of the use of SDP in combinatorial optimization is given below.


    7.1 An SDP Relaxation of the MAX CUT Problem

Let G be an undirected graph with nodes N = {1, . . . , n} and edge set E. Let wᵢⱼ = wⱼᵢ be the weight on edge (i, j), for (i, j) ∈ E. We assume that wᵢⱼ ≥ 0 for all (i, j) ∈ E. The MAX CUT problem is to determine a subset S of the nodes N for which the sum of the weights of the edges that cross from S to its complement S̄ is maximized (where S̄ := N \ S).

We can formulate MAX CUT as an integer program as follows. Let xⱼ = 1 for j ∈ S and xⱼ = −1 for j ∉ S. Then our formulation is:

MAXCUT : maximizeₓ (1/4) Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ wᵢⱼ(1 − xᵢxⱼ)
         s.t.      xⱼ ∈ {−1, 1}, j = 1, . . . , n.

Now let

Y = xxᵀ,

whereby

Yᵢⱼ = xᵢxⱼ, i = 1, . . . , n, j = 1, . . . , n.

Also let W be the matrix whose (i, j)th element is wᵢⱼ for i = 1, . . . , n and j = 1, . . . , n. Then MAX CUT can be equivalently formulated as:


MAXCUT : maximize_{Y,x} (1/4) Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ wᵢⱼ − (1/4) W • Y
         s.t.           xⱼ ∈ {−1, 1}, j = 1, . . . , n
                        Y = xxᵀ.

Notice in this problem that the first set of constraints are equivalent to Yⱼⱼ = 1, j = 1, . . . , n. We therefore obtain:

MAXCUT : maximize_{Y,x} (1/4) Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ wᵢⱼ − (1/4) W • Y
         s.t.           Yⱼⱼ = 1, j = 1, . . . , n
                        Y = xxᵀ.

Last of all, notice that the matrix Y = xxᵀ is a symmetric rank-1 positive semidefinite matrix. If we relax this condition by removing the rank-1 restriction, we obtain the following relaxation of MAX CUT, which is a semidefinite program:

RELAX : maximize_Y (1/4) Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ wᵢⱼ − (1/4) W • Y
        s.t.       Yⱼⱼ = 1, j = 1, . . . , n
                   Y ⪰ 0.

It is therefore easy to see that RELAX provides an upper bound on MAX CUT, i.e.,


MAXCUT ≤ RELAX.

As it turns out, one can also prove without too much effort that:

0.87856 RELAX ≤ MAXCUT ≤ RELAX.

This is an impressive result, in that it states that the value of the semidefinite relaxation is guaranteed to be no more than 1/0.87856 ≈ 1.14 times the value of the NP-hard problem MAX CUT.
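The identity behind the integer-program formulation — that (1/4) ΣᵢΣⱼ wᵢⱼ(1 − xᵢxⱼ) is exactly the weight of the cut induced by the signs xⱼ — can be checked by brute force on a small graph. The weight matrix W below is illustrative, not from the notes:

```python
import itertools
import numpy as np

# Small weighted graph; W is symmetric with zero diagonal (illustrative data).
W = np.array([[0., 1., 2., 0.],
              [1., 0., 1., 3.],
              [2., 1., 0., 1.],
              [0., 3., 1., 0.]])
n = W.shape[0]

best = -np.inf
for signs in itertools.product([-1, 1], repeat=n):
    x = np.array(signs, dtype=float)
    # objective (1/4) sum_ij w_ij (1 - x_i x_j)
    val = 0.25 * (W * (1 - np.outer(x, x))).sum()
    # cross-check: this equals the total weight of edges crossing the cut
    S = {j for j in range(n) if x[j] == 1}
    cut = sum(W[i, j] for i in range(n) for j in range(n)
              if (i in S) != (j in S)) / 2
    assert np.isclose(val, cut)
    best = max(best, val)
print(best)
```

Brute force is only viable for tiny n, of course; the point of the SDP relaxation is to get a provably good bound without enumerating the 2ⁿ sign vectors.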

    8 SDP in Convex Optimization

As stated above, SDP has very wide applications in convex optimization. The types of constraints that can be modeled in the SDP framework include: linear inequalities, convex quadratic inequalities, lower bounds on matrix norms, lower bounds on determinants of symmetric positive semidefinite matrices, lower bounds on the geometric mean of a nonnegative vector, plus many others.

Using these and other constructions, the following problems (among many others) can be cast in the form of a semidefinite program: linear programming, optimizing a convex quadratic form subject to convex quadratic inequality constraints, minimizing the volume of an ellipsoid that covers a given set of points and ellipsoids, maximizing the volume of an ellipsoid that is contained in a given polytope, plus a variety of maximum eigenvalue and minimum eigenvalue problems. In the subsections below we demonstrate how some important problems in convex optimization can be re-formulated as instances of SDP.


8.1 SDP for Convex Quadratically Constrained Quadratic Programming, Part I

A convex quadratically constrained quadratic program is a problem of the form:

QCQP : minimize xᵀQ₀x + q₀ᵀx + c₀
       s.t.     xᵀQᵢx + qᵢᵀx + cᵢ ≤ 0, i = 1, . . . , m,

where Q₀ ⪰ 0 and Qᵢ ⪰ 0, i = 1, . . . , m. This problem is the same as:

QCQP : minimize_{x,θ} θ
       s.t.           xᵀQ₀x + q₀ᵀx + c₀ − θ ≤ 0
                      xᵀQᵢx + qᵢᵀx + cᵢ ≤ 0, i = 1, . . . , m.

We can factor each Qᵢ into

Qᵢ = MᵢᵀMᵢ

for some matrix Mᵢ. Then note the equivalence:

⎡ I       Mᵢx         ⎤ ⪰ 0   ⟺   xᵀQᵢx + qᵢᵀx + cᵢ ≤ 0.
⎣ xᵀMᵢᵀ   −cᵢ − qᵢᵀx  ⎦

In this way we can write QCQP as:

QCQP : minimize_{x,θ} θ

       s.t. ⎡ I       M₀x             ⎤ ⪰ 0
            ⎣ xᵀM₀ᵀ   −c₀ − q₀ᵀx + θ  ⎦

            ⎡ I       Mᵢx         ⎤ ⪰ 0, i = 1, . . . , m.
            ⎣ xᵀMᵢᵀ   −cᵢ − qᵢᵀx  ⎦

Notice in the above formulation that the variables are θ and x and that all matrix coefficients are linear functions of θ and x.
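The Schur-complement equivalence used above can be sketched numerically: with Q = MᵀM, the block matrix is psd exactly when the quadratic constraint holds. The data M, q, c and the test points are illustrative:

```python
import numpy as np

# Check: [[I, Mx], [x'M', -c - q'x]] ⪰ 0  ⟺  x'Qx + q'x + c ≤ 0, with Q = M'M.
M = np.array([[1., 1.], [0., 1.]])
Q = M.T @ M
q = np.array([-4., 0.])
c = 1.0

def block_psd(x):
    Mx = M @ x
    top = np.hstack([np.eye(2), Mx[:, None]])
    bot = np.hstack([Mx[None, :], [[-c - q @ x]]])
    return np.linalg.eigvalsh(np.vstack([top, bot])).min() >= -1e-9

def quad_ok(x):
    return x @ Q @ x + q @ x + c <= 1e-9

for x in [np.array([1., 0.]), np.array([0., 0.]), np.array([3., 3.])]:
    assert block_psd(x) == quad_ok(x)
```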

8.2 SDP for Convex Quadratically Constrained Quadratic Programming, Part II

As it turns out, there is an alternative way to formulate QCQP as a semidefinite program. We begin with the following elementary proposition.

Proposition 8.1 Given a vector x ∈ ℝᵏ and a matrix W ∈ ℝᵏˣᵏ, then

⎡ 1  xᵀ ⎤ ⪰ 0   if and only if   W ⪰ xxᵀ.
⎣ x  W  ⎦

Using this proposition, it is straightforward to show that QCQP is equivalent to the following semidefinite program:


QCQP : minimize_{x,W,θ} θ

       s.t. ⎡ c₀ − θ  ½q₀ᵀ ⎤ • ⎡ 1  xᵀ ⎤ ≤ 0
            ⎣ ½q₀     Q₀   ⎦   ⎣ x  W  ⎦

            ⎡ cᵢ    ½qᵢᵀ ⎤ • ⎡ 1  xᵀ ⎤ ≤ 0, i = 1, . . . , m
            ⎣ ½qᵢ   Qᵢ   ⎦   ⎣ x  W  ⎦

            ⎡ 1  xᵀ ⎤ ⪰ 0.
            ⎣ x  W  ⎦

Notice in this formulation that there are now (n+1)(n+2)/2 variables, but that the constraints are all linear inequalities as opposed to semidefinite inequalities (apart from the single semidefinite condition on the matrix formed from 1, x, and W).
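Proposition 8.1 can be sanity-checked on random data (illustrative draws; the two psd tests must always agree):

```python
import numpy as np

# Numerical check of Proposition 8.1: [[1, x'], [x, W]] ⪰ 0 iff W - xx' ⪰ 0.
rng = np.random.default_rng(1)
for _ in range(50):
    x = rng.standard_normal(3)
    B = rng.standard_normal((3, 3))
    W = B @ B.T + rng.uniform(-0.5, 0.5) * np.eye(3)   # symmetric, maybe psd
    M = np.block([[np.ones((1, 1)), x[None, :]], [x[:, None], W]])
    lhs = np.linalg.eigvalsh(M).min() >= -1e-7
    rhs = np.linalg.eigvalsh(W - np.outer(x, x)).min() >= -1e-7
    assert lhs == rhs
```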

8.3 SDP for the Smallest Circumscribed Ellipsoid Problem

A given matrix R ≻ 0 and a given point z can be used to define an ellipsoid in ℝⁿ:

E_{R,z} := {y | (y − z)ᵀR(y − z) ≤ 1}.

One can prove that the volume of E_{R,z} is proportional to √(det(R⁻¹)).

Suppose we are given a convex set X ⊂ ℝⁿ described as the convex hull of k points c₁, . . . , cₖ. We would like to find an ellipsoid circumscribing these k points that has minimum volume. Our problem can be written in the following form:

MCP : minimize_{R,z} vol(E_{R,z})
      s.t.           cᵢ ∈ E_{R,z}, i = 1, . . . , k,

which is equivalent to:

MCP : minimize_{R,z} −ln(det(R))
      s.t.           (cᵢ − z)ᵀR(cᵢ − z) ≤ 1, i = 1, . . . , k
                     R ≻ 0.

Now factor R = M² where M ≻ 0 (that is, M is a square root of R), and now MCP becomes:

MCP : minimize_{M,z} −ln(det(M²))
      s.t.           (cᵢ − z)ᵀMᵀM(cᵢ − z) ≤ 1, i = 1, . . . , k,
                     M ≻ 0.

Next notice the equivalence:

⎡ I             Mcᵢ − Mz ⎤ ⪰ 0   ⟺   (cᵢ − z)ᵀMᵀM(cᵢ − z) ≤ 1.
⎣ (Mcᵢ − Mz)ᵀ   1        ⎦


In this way we can write MCP as:

MCP : minimize_{M,z} −2 ln(det(M))

      s.t. ⎡ I             Mcᵢ − Mz ⎤ ⪰ 0, i = 1, . . . , k,
           ⎣ (Mcᵢ − Mz)ᵀ   1        ⎦

           M ≻ 0.

Last of all, make the substitution y = Mz to obtain:

MCP : minimize_{M,y} −2 ln(det(M))

      s.t. ⎡ I            Mcᵢ − y ⎤ ⪰ 0, i = 1, . . . , k,
           ⎣ (Mcᵢ − y)ᵀ   1       ⎦

           M ≻ 0.

Notice that this last program involves semidefinite constraints where all of the matrix coefficients are linear functions of the variables M and y. However, the objective function is not a linear function. It is possible to convert this problem further into a genuine instance of SDP, because there is a way to convert constraints of the form

−ln(det(X)) ≤ θ

to a semidefinite system. Nevertheless, this is not necessary, either from a theoretical or a practical viewpoint, because it turns out that the function f(X) = −ln(det(X)) is extremely well-behaved and is very easy to optimize (both in theory and in practice).


Finally, note that after solving the formulation of MCP above, we can recover the matrix R and the center z of the optimal ellipsoid by computing

R = M² and z = M⁻¹y.
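The key equivalence in this derivation — the 2×2 block condition versus membership in the ellipsoid — can be checked numerically. The matrix M, center z, and test points below are illustrative:

```python
import numpy as np

# Check: with y = Mz,
#   [[I, Mc - y], [(Mc - y)', 1]] ⪰ 0  ⟺  (c - z)'M'M(c - z) ≤ 1.
M = np.array([[0.5, 0.0], [0.1, 0.4]])
z = np.array([1.0, -1.0])
y = M @ z

def block_ok(c):
    u = M @ c - y
    A = np.block([[np.eye(2), u[:, None]], [u[None, :], np.ones((1, 1))]])
    return np.linalg.eigvalsh(A).min() >= -1e-9

def member(c):
    d = c - z
    return d @ M.T @ M @ d <= 1 + 1e-9

for c in [np.array([1.0, -1.0]), np.array([2.0, 0.0]), np.array([9.0, 9.0])]:
    assert block_ok(c) == member(c)
```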

    8.4 SDP for the Largest Inscribed Ellipsoid Problem

Recall that a given matrix R ≻ 0 and a given point z can be used to define an ellipsoid in ℝⁿ:

E_{R,z} := {x | (x − z)ᵀR(x − z) ≤ 1},

and that the volume of E_{R,z} is proportional to √(det(R⁻¹)).

Suppose we are given a convex set X ⊂ ℝⁿ described as the intersection of k halfspaces {x | aᵢᵀx ≤ bᵢ}, i = 1, . . . , k, that is,

X = {x | Ax ≤ b},

where the ith row of the matrix A consists of the entries of the vector aᵢ, i = 1, . . . , k. We would like to find an ellipsoid inscribed in X of maximum volume. Our problem can be written in the following form:

MIP : maximize_{R,z} vol(E_{R,z})
      s.t.           E_{R,z} ⊂ X,


which is equivalent to:

MIP : maximize_{R,z} det(R⁻¹)
      s.t.           E_{R,z} ⊂ {x | aᵢᵀx ≤ bᵢ}, i = 1, . . . , k
                     R ≻ 0,

which is equivalent to:

MIP : maximize_{R,z} ln(det(R⁻¹))
      s.t.           maxₓ {aᵢᵀx | (x − z)ᵀR(x − z) ≤ 1} ≤ bᵢ, i = 1, . . . , k
                     R ≻ 0.

For a given i = 1, . . . , k, the solution to the optimization problem in the ith constraint is

x* = z + R⁻¹aᵢ / √(aᵢᵀR⁻¹aᵢ),

with optimal objective function value

aᵢᵀz + √(aᵢᵀR⁻¹aᵢ),

and so MIP can be rewritten as:


MIP : maximize_{R,z} ln(det(R⁻¹))
      s.t.           aᵢᵀz + √(aᵢᵀR⁻¹aᵢ) ≤ bᵢ, i = 1, . . . , k
                     R ≻ 0.

Now factor R⁻¹ = M² where M ≻ 0 (that is, M is a square root of R⁻¹), and now MIP becomes:

MIP : maximize_{M,z} ln(det(M²))
      s.t.           aᵢᵀz + √(aᵢᵀMᵀMaᵢ) ≤ bᵢ, i = 1, . . . , k
                     M ≻ 0,

which we can re-write as:

MIP : maximize_{M,z} 2 ln(det(M))
      s.t.           aᵢᵀMᵀMaᵢ ≤ (bᵢ − aᵢᵀz)², i = 1, . . . , k
                     bᵢ − aᵢᵀz ≥ 0, i = 1, . . . , k
                     M ≻ 0.

Next notice the equivalence:

⎡ (bᵢ − aᵢᵀz)I   Maᵢ        ⎤ ⪰ 0   ⟺   aᵢᵀMᵀMaᵢ ≤ (bᵢ − aᵢᵀz)²  and  bᵢ − aᵢᵀz ≥ 0.
⎣ (Maᵢ)ᵀ         bᵢ − aᵢᵀz  ⎦


In this way we can write MIP as:

MIP : maximize_{M,z} 2 ln(det(M))

      s.t. ⎡ (bᵢ − aᵢᵀz)I   Maᵢ        ⎤ ⪰ 0, i = 1, . . . , k,
           ⎣ (Maᵢ)ᵀ         bᵢ − aᵢᵀz  ⎦

           M ≻ 0.

Notice that this last program involves semidefinite constraints where all of the matrix coefficients are linear functions of the variables M and z. However, the objective function is not a linear function. It is possible, but not necessary in practice, to convert this problem further into a genuine instance of SDP, because there is a way to convert constraints of the form

−ln(det(X)) ≤ θ

to a semidefinite system. Such a conversion is not necessary, either from a theoretical or a practical viewpoint, because it turns out that the function f(X) = −ln(det(X)) is extremely well-behaved and is very easy to optimize (both in theory and in practice).

Finally, note that after solving the formulation of MIP above, we can recover the matrix R of the optimal ellipsoid by computing

R = M⁻².


    8.5 SDP for Eigenvalue Optimization

There are many types of eigenvalue optimization problems that can be formulated as SDPs. A typical eigenvalue optimization problem is to create a matrix

S := B − Σᵢ₌₁ᵏ wᵢAᵢ,

given symmetric data matrices B and Aᵢ, i = 1, . . . , k, using weights w₁, . . . , wₖ, in such a way as to minimize the difference between the largest and smallest eigenvalue of S. This problem can be written down as:

EOP : minimize_{w,S} λmax(S) − λmin(S)
      s.t.           S = B − Σᵢ₌₁ᵏ wᵢAᵢ,

where λmin(S) and λmax(S) denote the smallest and the largest eigenvalue of S, respectively.

    We

    now

    show

    how

    to

    convert

    this

    problem

    into

    an

    SDP.

    RecallthatS canbefactored intoS=QDQT whereQ isanorthonor-mal matrix (i.e., Q1 =QT) and D is a diagonal matrix consisting of theeigenvaluesofS. Theconditions:

    ISI

    can

    be

    rewritten

    as:

    Q(I)QT QDQT Q(I)QT.


After premultiplying the above by Qᵀ and postmultiplying by Q, these conditions become:

λI ⪯ D ⪯ μI,

which are equivalent to:

λ ≤ λmin(S) and λmax(S) ≤ μ.

Therefore EOP can be written as:

EOP : minimize_{w,S,λ,μ} μ − λ
      s.t.               S = B − Σᵢ₌₁ᵏ wᵢAᵢ
                         λI ⪯ S ⪯ μI.

This last problem is a semidefinite program. Using constructs such as those shown above, very many other types of eigenvalue optimization problems can be formulated as SDPs.
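The equivalence λI ⪯ S ⪯ μI ⟺ (λ ≤ λmin(S) and λmax(S) ≤ μ) is easy to confirm numerically; the data matrices B, A₁ and the weight below are illustrative:

```python
import numpy as np

# λI ⪯ S ⪯ μI holds exactly when λ ≤ λmin(S) and λmax(S) ≤ μ,
# so minimizing μ - λ subject to these conditions gives the eigenvalue spread.
B  = np.array([[3., 1.], [1., 0.]])
A1 = np.array([[1., 0.], [0., -1.]])
w  = 0.7
S  = B - w * A1

eigs = np.linalg.eigvalsh(S)
lam, mu = eigs.min(), eigs.max()

# λI ⪯ S means S - λI is psd; S ⪯ μI means μI - S is psd.
assert np.linalg.eigvalsh(S - lam * np.eye(2)).min() >= -1e-12
assert np.linalg.eigvalsh(mu * np.eye(2) - S).min() >= -1e-12
print(mu - lam)   # the spread that EOP minimizes over w
```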


    9 SDP in Control Theory

A variety of control and system problems can be cast and solved as instances of SDP. However, this topic is beyond the scope of these notes.

    10 Interior-point Methods for SDP

At the heart of an interior-point method is a barrier function that exerts a repelling force from the boundary of the feasible region. For SDP, we need a barrier function whose values approach +∞ as points X approach the boundary of the semidefinite cone Sⁿ₊.

Let X ∈ Sⁿ₊. Then X will have n eigenvalues, say λ₁(X), . . . , λₙ(X) (possibly counting multiplicities). We can characterize the interior of the semidefinite cone as follows:

int Sⁿ₊ = {X ∈ Sⁿ | λ₁(X) > 0, . . . , λₙ(X) > 0}.

A natural barrier function to use to repel X from the boundary of Sⁿ₊ then is

−Σⱼ₌₁ⁿ ln(λⱼ(X)) = −ln(Πⱼ₌₁ⁿ λⱼ(X)) = −ln(det(X)).

Consider the logarithmic barrier problem BSDP(μ) parameterized by the positive barrier parameter μ:

BSDP(μ) : minimize C • X − μ ln(det(X))
          s.t.     Aᵢ • X = bᵢ, i = 1, . . . , m,
                   X ≻ 0.

Let fμ(X) denote the objective function of BSDP(μ). Then it is not too difficult to derive:

∇fμ(X) = C − μX⁻¹,    (1)

and so the Karush-Kuhn-Tucker conditions for BSDP(μ) are:

Aᵢ • X = bᵢ, i = 1, . . . , m, X ≻ 0,
                                          (2)
C − μX⁻¹ = Σᵢ₌₁ᵐ yᵢAᵢ.

Because X ≻ 0, we can factorize X into X = LLᵀ. We then can define

S = μX⁻¹ = μL⁻ᵀL⁻¹,

which implies

(1/μ) LᵀSL = I,

and we can rewrite the Karush-Kuhn-Tucker conditions as:

Aᵢ • X = bᵢ, i = 1, . . . , m, X ≻ 0, X = LLᵀ
Σᵢ₌₁ᵐ yᵢAᵢ + S = C                               (3)
I − (1/μ) LᵀSL = 0.

From the equations of (3) it follows that if (X, y, S) is a solution of (3), then X is feasible for SDP, (y, S) is feasible for SDD, and the resulting duality gap is

S • X = Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ Sᵢⱼ Xᵢⱼ = Σⱼ₌₁ⁿ (SX)ⱼⱼ = Σⱼ₌₁ⁿ μ = nμ.

This suggests that we try solving BSDP(μ) for a variety of values of μ as μ → 0.
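The duality-gap identity S • X = nμ for S = μX⁻¹ is a one-line consequence of trace(SX) = μ trace(I); it can be checked numerically (the random factor L below is illustrative):

```python
import numpy as np

# If S = μ X⁻¹, then S • X = trace(SX) = μ trace(I) = nμ.
rng = np.random.default_rng(2)
n, mu = 4, 0.3
L = np.tril(rng.standard_normal((n, n))) + 2 * np.eye(n)
X = L @ L.T                        # X ≻ 0 with X = LL'
S = mu * np.linalg.inv(X)

assert np.isclose(np.trace(S @ X), n * mu)
assert np.isclose((S * X).sum(), n * mu)   # S • X = sum_ij S_ij X_ij
```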

However, we cannot usually solve (3) exactly, because the fourth equation group is not linear in the variables. We will instead define a θ-approximate solution of the Karush-Kuhn-Tucker conditions (3). Before doing so, we introduce the following norm on matrices, called the Frobenius norm:

‖M‖ := √(M • M) = √( Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ Mᵢⱼ² ).

For some important properties of the Frobenius norm, see the last subsection of this section. A θ-approximate solution of BSDP(μ) is defined as any solution (X, y, S) of

Aᵢ • X = bᵢ, i = 1, . . . , m, X ≻ 0, X = LLᵀ
Σᵢ₌₁ᵐ yᵢAᵢ + S = C                               (4)
‖I − (1/μ) LᵀSL‖ ≤ θ.

Lemma 10.1 If (X̄, ȳ, S̄) is a θ-approximate solution of BSDP(μ) and θ < 1, then X̄ is feasible for SDP, (ȳ, S̄) is feasible for SDD, and the duality gap satisfies:

nμ(1 − θ) ≤ C • X̄ − Σᵢ₌₁ᵐ ȳᵢbᵢ = X̄ • S̄ ≤ nμ(1 + θ).    (5)

Proof: Primal feasibility is obvious. To prove dual feasibility, we need to show that S̄ ⪰ 0. To see this, define

R = I − (1/μ) L̄ᵀS̄L̄    (6)

and note that ‖R‖ ≤ θ < 1. Rearranging (6), we obtain

S̄ = μ L̄⁻ᵀ(I − R)L̄⁻¹ ≻ 0,

because ‖R‖ < 1 implies that every eigenvalue of I − R is positive. The duality gap is then

X̄ • S̄ = trace(X̄S̄) = μ trace(L̄L̄ᵀL̄⁻ᵀ(I − R)L̄⁻¹) = μ trace(I − R) = μ(n − trace(R)),

and since every eigenvalue of R is at most θ in absolute value, |trace(R)| ≤ nθ, which gives (5).


    q.e.d.

    10.1 The Algorithm

Based on the analysis just presented, we are motivated to develop the following algorithm:

Step 0. Initialization. Data is (X⁰, y⁰, S⁰, μ⁰). k = 0. Assume that (X⁰, y⁰, S⁰) is a θ-approximate solution of BSDP(μ⁰) for some known value of θ that satisfies θ ≤ 1/4.

Then for all α ∈ [0, 1),

fμ⁰( X̄ + (α/‖L̄⁻¹D̄L̄⁻ᵀ‖) D̄ ) ≤ fμ⁰(X̄) − μ⁰ α ‖L̄⁻¹D̄L̄⁻ᵀ‖ + μ⁰ α²/(2(1 − α)).

In particular,

fμ⁰( X̄ + (0.2/‖L̄⁻¹D̄L̄⁻ᵀ‖) D̄ ) ≤ fμ⁰(X̄) − 0.025μ⁰.    (11)

In order to prove this proposition, we will need two powerful facts about the logarithm function:

Fact 1. Suppose that |x| ≤ δ < 1. Then

x − x²/(2(1 − δ)) ≤ ln(1 + x) ≤ x.

Proof: We have:

ln(1 + x) = x − x²/2 + x³/3 − x⁴/4 + . . .
          ≥ x − |x|²/2 − |x|³/3 − |x|⁴/4 − . . .
          ≥ x − |x|²/2 − |x|³/2 − |x|⁴/2 − . . .
          = x − (x²/2)(1 + |x| + |x|² + |x|³ + . . .)
          = x − x²/(2(1 − |x|))
          ≥ x − x²/(2(1 − δ)).


q.e.d.

Fact 2. Suppose that R ∈ Sⁿ and that ‖R‖ < 1. Then

I • R − ‖R‖²/(2(1 − ‖R‖)) ≤ ln(det(I + R)) ≤ I • R.

Proof: Factor R = QDQᵀ where Q is orthonormal and D is a diagonal matrix of the eigenvalues of R. Then first note that ‖R‖ = ‖D‖ = √( Σⱼ₌₁ⁿ Dⱼⱼ² ). We then have:

ln(det(I + R)) = ln(det(I + QDQᵀ)) = ln(det(I + D))
              = Σⱼ₌₁ⁿ ln(1 + Dⱼⱼ)
              ≥ Σⱼ₌₁ⁿ ( Dⱼⱼ − Dⱼⱼ²/(2(1 − ‖D‖)) )
              = I • D − ‖D‖²/(2(1 − ‖R‖))
              = I • QᵀRQ − ‖R‖²/(2(1 − ‖R‖))
              = I • R − ‖R‖²/(2(1 − ‖R‖)).

q.e.d.

Proof of Proposition 10.2. Let

ᾱ = α/‖L̄⁻¹D̄L̄⁻ᵀ‖,

and notice that ‖ᾱ L̄⁻¹D̄L̄⁻ᵀ‖ = α.


Then

fμ⁰(X̄ + ᾱD̄) = C • (X̄ + ᾱD̄) − μ⁰ ln(det(X̄ + ᾱD̄))
            = C • X̄ + ᾱ C • D̄ − μ⁰ ln(det( L̄(I + ᾱL̄⁻¹D̄L̄⁻ᵀ)L̄ᵀ ))
            = C • X̄ − μ⁰ ln(det(X̄)) + ᾱ C • D̄ − μ⁰ ln(det(I + ᾱL̄⁻¹D̄L̄⁻ᵀ))
            ≤ fμ⁰(X̄) + ᾱ C • D̄ − μ⁰ ᾱ I • (L̄⁻¹D̄L̄⁻ᵀ) + μ⁰ α²/(2(1 − α))
            = fμ⁰(X̄) + ᾱ (C − μ⁰X̄⁻¹) • D̄ + μ⁰ α²/(2(1 − α)).

Now, (D̄, ȳ) solve the Normal equations:

μ⁰ X̄⁻¹D̄X̄⁻¹ + C − μ⁰X̄⁻¹ = Σᵢ₌₁ᵐ ȳᵢAᵢ
                                            (12)
Aᵢ • D̄ = 0, i = 1, . . . , m.

Taking the inner product of both sides of the first equation above with D̄ and rearranging yields:

μ⁰ X̄⁻¹D̄X̄⁻¹ • D̄ = −(C − μ⁰X̄⁻¹) • D̄.

Substituting this in our inequality above yields:

fμ⁰(X̄ + ᾱD̄) ≤ fμ⁰(X̄) − μ⁰ ᾱ X̄⁻¹D̄X̄⁻¹ • D̄ + μ⁰ α²/(2(1 − α))
            = fμ⁰(X̄) − μ⁰ ᾱ ‖L̄⁻¹D̄L̄⁻ᵀ‖² + μ⁰ α²/(2(1 − α))
            = fμ⁰(X̄) − μ⁰ α ‖L̄⁻¹D̄L̄⁻ᵀ‖ + μ⁰ α²/(2(1 − α)).

Substituting α = 0.2 and ‖L̄⁻¹D̄L̄⁻ᵀ‖ > 1/4 yields the final result. q.e.d.

Last of all, we prove a bound on the number of iterations that the algorithm will need in order to find a 1/4-approximate solution of BSDP(μ⁰):

Proposition 10.3 Suppose that X⁰ satisfies Aᵢ • X⁰ = bᵢ, i = 1, . . . , m, and X⁰ ≻ 0. Let μ⁰ be given and let f*μ⁰ be the optimal objective function value of BSDP(μ⁰). Then the algorithm initiated at X⁰ will find a 1/4-approximate solution of BSDP(μ⁰) in at most

k = ⌈ (fμ⁰(X⁰) − f*μ⁰) / (0.025μ⁰) ⌉

iterations.

Proof: This follows immediately from (11). Each iteration that is not a 1/4-approximate solution decreases the objective function fμ⁰(X) of BSDP(μ⁰) by at least 0.025μ⁰. Therefore, there cannot be more than

(fμ⁰(X⁰) − f*μ⁰) / (0.025μ⁰)


iterations that are not 1/4-approximate solutions of BSDP(μ⁰). q.e.d.

    10.5 Some Properties of the Frobenius Norm

Proposition 10.4 If M ∈ Sⁿ, then

1. ‖M‖ = √( Σⱼ₌₁ⁿ (λⱼ(M))² ), where λ₁(M), λ₂(M), . . . , λₙ(M) is an enumeration of the n eigenvalues of M.

2. If λ is any eigenvalue of M, then |λ| ≤ ‖M‖.

3. |trace(M)| ≤ √n ‖M‖.

4. If ‖M‖ < 1, then I + M ≻ 0.

q.e.d.


Proposition 10.5 If A, B ∈ Sⁿ, then ‖AB‖ ≤ ‖A‖ ‖B‖.

Proof: Applying the Cauchy–Schwarz inequality to each entry of AB, we have

    ‖AB‖² = Σ_{i=1}^n Σ_{j=1}^n ( Σ_{k=1}^n A_{ik}B_{kj} )²

          ≤ Σ_{i=1}^n Σ_{j=1}^n ( Σ_{k=1}^n A_{ik}² ) ( Σ_{k=1}^n B_{kj}² )

          = ( Σ_{i=1}^n Σ_{k=1}^n A_{ik}² ) ( Σ_{j=1}^n Σ_{k=1}^n B_{kj}² ) = ‖A‖² ‖B‖² .

q.e.d.
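A quick numerical check of Proposition 10.5 on invented matrices:

```python
import math

def frob(A):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in A for x in row))

def matmul(A, B):
    """Plain matrix product via inner products of rows with columns."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2.0, 1.0], [1.0, 3.0]]
B = [[1.0, -1.0], [-1.0, 2.0]]
assert frob(matmul(A, B)) <= frob(A) * frob(B)   # Proposition 10.5
```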

11 Issues in Solving SDP using the Ellipsoid Algorithm

To see how the ellipsoid algorithm is used to solve a semidefinite program, assume for convenience that the format of the problem is that of the dual problem SDD. Then the feasible region of the problem can be written as:

    F = { y = (y_1, . . . , y_m) ∈ ℝ^m  |  C − Σ_{i=1}^m y_i A_i ⪰ 0 } ,

and the objective function is Σ_{i=1}^m y_i b_i. Note that F is just a convex region in ℝ^m.

Recall that at any iteration of the ellipsoid algorithm, the set of solutions of SDD is known to lie in the current ellipsoid, and the center of the current ellipsoid is, say, ȳ = (ȳ_1, . . . , ȳ_m). If ȳ ∈ F, then we perform an optimality cut of the form Σ_{i=1}^m y_i b_i ≥ Σ_{i=1}^m ȳ_i b_i, and use standard formulas to update the ellipsoid and its new center. If ȳ ∉ F, then we perform a feasibility cut by computing a vector h ∈ ℝ^m such that h^T ȳ > h^T y for all y ∈ F.

There are four issues that must be resolved in order to implement the above version of the ellipsoid algorithm to solve SDD:

1. Testing if ȳ ∈ F. This is done by computing the matrix S = C − Σ_{i=1}^m ȳ_i A_i. If S ⪰ 0, then ȳ ∈ F. Testing if the matrix S ⪰ 0 can be done by computing an exact Cholesky factorization of S, which takes O(n³) operations, assuming that computing square roots can be performed (exactly) in one operation.
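A minimal sketch of such a PSD test on small dense matrices (it uses a square-root-free LDLᵀ-style elimination rather than a literal Cholesky factorization, so the exact-square-root assumption is not needed; the floating-point tolerance is an implementation choice, not part of the text's exact-arithmetic model):

```python
def is_psd_cholesky(S, tol=1e-10):
    """Test S >= 0 (PSD) by symmetric Gaussian elimination (LDL^T pivots).

    S is a symmetric matrix given as a list of rows. The factorization
    completes with nonnegative pivots exactly when S is PSD (up to the
    tolerance), and takes O(n^3) operations as in the text.
    """
    n = len(S)
    A = [row[:] for row in S]            # work on a copy
    for k in range(n):
        pivot = A[k][k]
        if pivot < -tol:
            return False                 # negative pivot: not PSD
        if pivot <= tol:
            # Zero pivot: the rest of the row must vanish for PSD-ness.
            if any(abs(A[k][j]) > tol for j in range(k + 1, n)):
                return False
            continue
        for i in range(k + 1, n):        # eliminate below/right of the pivot
            f = A[i][k] / pivot
            for j in range(k + 1, n):
                A[i][j] -= f * A[k][j]
    return True

assert is_psd_cholesky([[2.0, 1.0], [1.0, 2.0]])        # eigenvalues 1, 3
assert not is_psd_cholesky([[1.0, 2.0], [2.0, 1.0]])    # eigenvalues -1, 3
```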

2. Computing a feasibility cut. As above, testing if S ⪰ 0 can be computed in O(n³) operations. If S is not positive semidefinite, then again assuming exact arithmetic, we can find an n-vector v such that v^T S v < 0 in O(n³) operations as well. Then the feasibility cut vector h is computed by the formula:

    h_i = v^T A_i v,  i = 1, . . . , m ,

whose computation requires O(mn²) operations. Notice that for any y ∈ F,

    h^T y = Σ_{i=1}^m y_i v^T A_i v = v^T ( Σ_{i=1}^m y_i A_i ) v ≤ v^T C v < v^T ( Σ_{i=1}^m ȳ_i A_i ) v = Σ_{i=1}^m ȳ_i v^T A_i v = h^T ȳ ,

thereby showing that h indeed provides a feasibility cut for F at y = ȳ.
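A tiny worked instance of this cut (the data C, A₁ and the certificate v below are invented for illustration; with m = 1 and n = 2, the feasible region F is just the interval [−1, 1]):

```python
# F = { y : C - y*A >= 0 } with C = I and A = diag(1, -1), i.e. F = [-1, 1].

def quad_form(v, M):
    """Compute v^T M v for a small dense matrix M."""
    n = len(v)
    return sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n))

C = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 0.0], [0.0, -1.0]]

y_bar = 2.0                      # infeasible center: C - 2A = diag(-1, 3)
S = [[C[i][j] - y_bar * A[i][j] for j in range(2)] for i in range(2)]

v = [1.0, 0.0]                   # certificate of infeasibility: v^T S v = -1 < 0
assert quad_form(v, S) < 0

h = quad_form(v, A)              # cut coefficient h_1 = v^T A_1 v = 1
# Every feasible y (here y in [-1, 1]) satisfies h*y <= v^T C v < h*y_bar:
for y in [-1.0, 0.0, 1.0]:
    assert h * y <= quad_form(v, C) < h * y_bar
```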

3. Starting the ellipsoid algorithm. We need to determine an upper bound R on the distance of some optimal solution y from the origin. This cannot be done by examining the input length of the data, as is the case in linear programming. One needs to know some special information about the specific problem at hand in order to determine R before solving the semidefinite program.

4. Stopping the ellipsoid algorithm. Suppose that we seek an ε-optimal solution of SDD. In order to prove a complexity bound on the number of iterations needed to find an ε-optimal solution, we need to know beforehand the radius r of a Euclidean ball that is contained in the set of ε-optimal solutions of SDD. The value of r also cannot be determined by examining the input length of the data, as is the case in linear programming. One needs to know some special information about the specific problem at hand in order to determine r before solving the semidefinite program.

    12 Current Research in SDP

There are many very active research areas in semidefinite programming in nonlinear (convex) programming, in combinatorial optimization, and in control theory. In the area of convex analysis, recent research topics include the geometry and the boundary structure of SDP feasible regions (including notions of degeneracy) and research related to the computational complexity of SDP, such as decidability questions, certificates of infeasibility, and duality theory. In the area of combinatorial optimization, there has been much research on the practical and the theoretical use of SDP relaxations of hard combinatorial optimization problems. As regards interior point methods, there are a host of research issues, mostly involving the development of different interior point algorithms and their properties, including rates of convergence, performance guarantees, etc.

    13 Computational State of the Art of SDP

Because SDP has so many applications, and because interior point methods show so much promise, perhaps the most exciting area of research on SDP has to do with computation and implementation of interior point algorithms for solving SDP. Much research has focused on the practical efficiency of interior point methods for SDP. However, in the research to date, computational issues have arisen that are much more complex than those for linear programming, and these computational issues are only beginning to be well understood. They probably stem from a variety of factors, including the fact that SDP is not guaranteed to have strictly complementary optimal solutions (as is the case in linear programming). Finally, because SDP is such a new field, there is no representative suite of practical problems on which to test algorithms, i.e., there is no equivalent version of the netlib suite of industrial linear programming problems.

A good website for semidefinite programming is:

http://www.zib.de/helmberg/semidef.html.

    14 Exercises

1. For a (square) matrix M ∈ ℝ^{n×n}, define trace(M) = Σ_{j=1}^n M_{jj}, and for two matrices A, B ∈ ℝ^{k×l} define

    A • B := Σ_{i=1}^k Σ_{j=1}^l A_{ij} B_{ij} .

Prove that:

(a) A • B = trace(A^T B).

(b) trace(MN) = trace(NM).
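Both identities are easy to sanity-check numerically; the matrices below are arbitrary:

```python
# Quick numeric check of (a) and (b) on small invented matrices.

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def trace(M):
    return sum(M[j][j] for j in range(len(M)))

def dot(A, B):
    """A . B = sum of entrywise products."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]

assert dot(A, B) == trace(matmul(transpose(A), B))   # part (a)
assert trace(matmul(A, B)) == trace(matmul(B, A))    # part (b)
```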

2. Let S^{k×k}_+ denote the cone of positive semidefinite symmetric matrices, namely S^{k×k}_+ = { X ∈ S^{k×k} | v^T X v ≥ 0 for all v ∈ ℝ^k }. Considering S^{k×k}_+ as a cone, prove that (S^{k×k}_+)* = S^{k×k}_+, thus showing that S^{k×k}_+ is self-dual.


3. Consider the problem:

    (P): minimize_d  d^T Q d
         s.t.        d^T M d = 1 .

If M ≻ 0, show that (P) is equivalent to:

    (S): minimize_X  Q • X
         s.t.        M • X = 1
                     X ⪰ 0 .

What is the SDP dual of (S)?

4. Prove that X ⪰ x x^T if and only if

    [ X    x ]
    [ x^T  1 ]  ⪰ 0 .

5. Let K := { X ∈ S^{k×k} | X ⪰ 0 } and define

    K* := { S ∈ S^{k×k} | S • X ≥ 0 for all X ⪰ 0 } .

Prove that K = K*.

6. Let λ_min denote the smallest eigenvalue of the symmetric matrix Q. Show that the following three optimization problems each have optimal objective function value equal to λ_min.

    (P1): minimize_d  d^T Q d
          s.t.        d^T I d = 1 .

    (P2): maximize  λ
          s.t.      Q ⪰ λI .

    (P3): minimize_X  Q • X
          s.t.        I • X = 1
                      X ⪰ 0 .
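A quick numerical sanity check of the claim on an invented 2×2 instance, using the closed-form eigenvalues of a symmetric 2×2 matrix rather than an SDP solver:

```python
import math

# Q = [[2, 1], [1, 2]] has eigenvalues 1 and 3, so lambda_min = 1.
a, b, c = 2.0, 1.0, 2.0                       # Q = [[a, b], [b, c]]
lam_min = (a + c - math.sqrt((a - c) ** 2 + 4 * b * b)) / 2
assert lam_min == 1.0

# (P1): the unit vector d = (1/sqrt(2), -1/sqrt(2)) attains d^T Q d = lambda_min.
d = (1 / math.sqrt(2), -1 / math.sqrt(2))
val = a * d[0] ** 2 + 2 * b * d[0] * d[1] + c * d[1] ** 2
assert abs(val - lam_min) < 1e-12

# (P3): X = d d^T is feasible (I . X = 1, X >= 0) and attains Q . X = lambda_min.
X = [[d[i] * d[j] for j in range(2)] for i in range(2)]
QX = sum(q * x for qr, xr in zip([[a, b], [b, c]], X) for q, x in zip(qr, xr))
assert abs(QX - lam_min) < 1e-12
```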

7. Suppose that Q ⪰ 0. Prove the following:

    x^T Q x + q^T x + c ≤ 0

if and only if there exists W for which

    [ c        (1/2)q^T ]     [ 1    x^T ]                    [ 1    x^T ]
    [ (1/2)q   Q        ]  •  [ x    W   ]  ≤ 0      and      [ x    W   ]  ⪰ 0 .

Hint: use the property shown in Exercise 4.

8. Consider the matrix

    M = [ A    B ]
        [ B^T  C ] ,

where A and C are symmetric matrices and A ≻ 0 (in particular, A is nonsingular). Prove that M ⪰ 0 if and only if C − B^T A^{-1} B ⪰ 0.
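A numerical sanity check of Exercise 8 with invented 1×1 blocks (so positive semidefiniteness reduces to sign conditions on the diagonal entries and the determinant):

```python
# M = [[A, B], [B, C]] with scalar blocks A = 2, B = 1 and varying C.

def is_psd_2x2(m11, m12, m22, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff both diagonal entries and the
    determinant are nonnegative."""
    return m11 >= -tol and m22 >= -tol and m11 * m22 - m12 * m12 >= -tol

A, B = 2.0, 1.0
for C in [0.25, 0.5, 1.0]:
    schur = C - B * (1.0 / A) * B          # C - B^T A^{-1} B = C - 0.5
    # M is PSD exactly when the Schur complement is nonnegative:
    assert is_psd_2x2(A, B, C) == (schur >= 0)
```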