64
From Markov chains to Multivariable digraphs Yaokun Wu (~n) Shanghai Jiao Tong University G2M2 Yekaterinburg, July 28, 2017

From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

  • Upload
    others

  • View
    9

  • Download
    0

Embed Size (px)

Citation preview

Page 1: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

From Markov chains to Multivariabledigraphs

Yaokun Wu (~�n)Shanghai Jiao Tong University

G2M2Yekaterinburg, July 28, 2017

Page 2: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

This talk is mainly based on:

• Yaokun Wu, Zeying Xu (ML¤), Yinfeng Zhu (6Û¸), Anexpansion property of Boolean linear maps, The ElectronicJournal of Linear Algebra, 31 (2016) 381–407.

• Yaokun Wu, Zeying Xu, Yinfeng Zhu, Strongly connectedmultivariate digraphs, The Electronic Journal ofCombinatorics, 24(1) (2017) #P1.47, 44 pp.

• Zongchen Chen (�m�), Yaokun Wu, An expansionproperty of Boolean multilinear maps, preprint.

• Chengyang Qian (a§�), Yaokun Wu, Yinfeng Zhu, Somenonexistence results for cyclic decompositions, preprint.

• ...

2 / 34

Page 3: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Markov chain

Figure: Andrey Markov,14 June 1856 – 20 July1922 https://en.wikipedia.org/wiki/Andrey_Markov

A Markov chain is a sequence ofpossible events in which theprobability of every event relieslinearly on the probability of thestates attained in the previousevent.

More generally, for any positiveinteger t, an order-t Markov chainis a sequence of possible events inwhich the probability of each eventdepends multi-linearly on theprobability of the states attained inthe previous t events.

3 / 34

Page 4: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Markov chain, Contd.

Let E = (E1, E2, E3, . . .) be a sequence of random variableswhich take values in a finite state space X. We call E an order-tMarkov chain if it holds

Pr(Em = x : Em−1 = xt, Em−2 = xt−1, . . . , Em−t = x1)

, Ax,xt ,xt−1,...,x1 ,

for all m ≥ t + 1.

For the (t + 1)-dimensional array (tensor) A to represent aprobability transition, we surely have A ≥ 0 and∑

x∈XAx,xt ,xt−1,...,x1 = 1 for all xt, xt−1, . . . , x1 ∈ X.

4 / 34

Page 5: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Markov chain, again

• A finite set: X

• The Kronecker delta function at x ∈ X: δx ∈ RX

• The probability simplex on X:∆X = {p ∈ RX : p ≥ 0,

∑x∈X p(x) = 1}, namely the convex hull

of {δx : x ∈ X}.

• An order-t Markov chain on X: a multilinear mapf : RX × · · · × RX︸ ︷︷ ︸

t

→ RX such that f maps ∆X × · · · × ∆X︸ ︷︷ ︸t

into

∆X.

5 / 34

Page 6: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Markov chain, again, Contd.

• Markov operator for f : the map M f which sends(p(1) ⊗ p(2) ⊗ · · · ⊗ p(t)) ∈ RX × · · · × RX︸ ︷︷ ︸

t

to

p(2) ⊗ p(3) ⊗ . . . ⊗ p(t) ⊗ f (p(1), p(2), . . . , p(t))

• To understand the higher order Markov chain f is to studythe dynamical behaviours of the t-linear map M f from∆X × · · · × ∆X︸ ︷︷ ︸

t

to itself.

• The adjacency tensor for f : A( f ) ∈ RX×···×X as specified byf (δx1 , . . . , δxt ) =

∑xt+1∈XA( f )xt+1,...,x1δxt+1

• The De Bruijn form of f : the weighted digraph Γ f withvertex set Xt and the set of arcs x1 · · · xt → x2 · · · xt+1 withweight A( f )xt+1,...,x1 .

6 / 34

Page 7: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

JOURNAL OF ALGEBRAIC STATISTICSVol. 3, No. 1, 2012, 1-10ISSN 1309-3452 – www.jalgstat.com

Geometry of Higher-Order Markov Chains

Bernd Sturmfels

Department of Mathematics, University of California at Berkeley, Berkeley, CA 94720, USA

Abstract. We determine an explicit Grobner basis, consisting of linear forms and determinantalquadrics, for the prime ideal of Raftery’s mixture transition distribution model for Markov chains.When the states are binary, the corresponding projective variety is a linear space, the model itselfconsists of two simplices in a cross-polytope, and the likelihood function typically has two localmaxima. In the general non-binary case, the model corresponds to a cone over a Segre variety.

2000 Mathematics Subject Classifications: Primary 60J10, 13P25; Secondary 62H05

Key Words and Phrases: Markov chains, Grobner bases, Maximum likelihood

1. Introduction

In this note we investigate Adrian Raftery’s mixture transition distribution model(MTD) from the perspective of algebraic statistics [4, 8]. The MTD model, which wasfirst proposed in [9], has a wide range of applications in engineering and the sciences [10].The article by Berchtold and Raftery [2] offers a detailed introduction and review.

The point of departure for this project was a conjecture due to Donald Richards [11],stating that the likelihood function of an MTD model can have multiple local maxima.We establish this conjecture for the case of binary states in Proposition 1.

Our main result, to be derived in Section 4, gives an explicit Grobner basis for theMTD model. Here, both the sequence length and the number of states are arbitrary.

We begin with an algebraic description of the model in [2, 9]. Fix a pair of positiveintegers l and m, and set N = ml+1 − 1. We define the statistical model MTDl,m whosestate space is the set [m]l+1 of sequences i0i1 · · · il of length l+ 1 over the alphabet [m] ={1, 2, . . . ,m}. The model has (m − 1)m + l − 1 parameters, given by the entries of anm ×m-transition matrix (qij) and a probability distribution λ = (λ1, . . . , λl) on the set[l] = {1, 2, . . . , l} of the hidden states. Thus the parameter space is the product of simplices(∆m−1)m ×∆l−1. The model MTDl,m will be a semialgebraic subset of the simplex ∆N .That simplex has its coordinates pi0i1···il indexed by sequences in [m]l+1.

The model MTDl,m is the image of the bilinear map

φl,m : (∆m−1)m ×∆l−1 → ∆N

Email addresses: [email protected] (B. Sturmfels)

http://www.jalgstat.com/ 1 c© 2012 JAlgStat All rights reserved.

For the order-t Markov chain f , M f is a linear map when t = 1,while M f is in general not any linear map when t > 1!

When t = 1, the dynamical behaviour of M f is quite clear; Whent > 1, the dynamical behaviour of M f is far from understood.

7 / 34

Page 8: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

A simple sample question

QuestionLet f be an order-t Markov chainon a finite set X and letΨ ∈ ∆X × · · · × ∆X︸ ︷︷ ︸

t

. We call

Φ ∈ ∆X × · · · × ∆X︸ ︷︷ ︸t

a history of Ψ in f

provided there exists a positiveinteger n such that Mn

f (Φ) = Ψ.

Is it true that Ψ can only have afinite number of histories?

Figure: http://simplecoffee.ru/ourstory.html

8 / 34

Page 9: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

A simple sample question, Contd.

When t = 1, the question has a positive answer: Every Markovchain must have a beginning (Loo-Keng Hua (uÛ�), 1984).This also explains that there only exist one-sided Markovchains.

This is claimed to be a result in the folklore (Kai Lai Chung (¨m4), Chance & Choice, World Scientific, 2004).

9 / 34

Page 10: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

A simple sample question, Contd.

When t = 1, the question has a positive answer: Every Markovchain must have a beginning (Loo-Keng Hua (uÛ�), 1984).This also explains that there only exist one-sided Markovchains.

This is claimed to be a result in the folklore (Kai Lai Chung (¨m4), Chance & Choice, World Scientific, 2004).

9 / 34

Page 11: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Inhomogeneous Markov chainLet f and g be two multilinear maps from RX × · · · × RX︸ ︷︷ ︸

t

to RX.

M f Mg is surely still a t-multilinear map:

(M f ◦Mg)(δx1 , . . . , δxt )

= M f(δx2 , . . . , δxt , g(δx1 , . . . , δxt )

)=

(δx3 , . . . , δxt , g(δx1 , . . . , δxt ), f (δx2 , . . . , δxt , g(δx1 , . . . , δxt )

)The product of A( f ) and A(g) is defined to correspond to thet-linear map which sends (δx1 , . . . , δxt ) to(δx2 , . . . , δxt−1 , f (δx2 , . . . , δxt , g(δx1 , . . . , δxt )

).(

A( f )A(g))

xt+2,xt ,...,x1�

∑xt+1∈XA( f )xt+2,xt+1,...,x2A(g)xt+1,xt ,...,x1

Flow of time in the inhomogeneous higher order Markov chaindefines the product of tensors!

10 / 34

Page 12: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Inhomogeneous Markov chainLet f and g be two multilinear maps from RX × · · · × RX︸ ︷︷ ︸

t

to RX.

M f Mg is surely still a t-multilinear map:

(M f ◦Mg)(δx1 , . . . , δxt )

= M f(δx2 , . . . , δxt , g(δx1 , . . . , δxt )

)=

(δx3 , . . . , δxt , g(δx1 , . . . , δxt ), f (δx2 , . . . , δxt , g(δx1 , . . . , δxt )

)

The product of A( f ) and A(g) is defined to correspond to thet-linear map which sends (δx1 , . . . , δxt ) to(δx2 , . . . , δxt−1 , f (δx2 , . . . , δxt , g(δx1 , . . . , δxt )

).(

A( f )A(g))

xt+2,xt ,...,x1�

∑xt+1∈XA( f )xt+2,xt+1,...,x2A(g)xt+1,xt ,...,x1

Flow of time in the inhomogeneous higher order Markov chaindefines the product of tensors!

10 / 34

Page 13: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Inhomogeneous Markov chainLet f and g be two multilinear maps from RX × · · · × RX︸ ︷︷ ︸

t

to RX.

M f Mg is surely still a t-multilinear map:

(M f ◦Mg)(δx1 , . . . , δxt )

= M f(δx2 , . . . , δxt , g(δx1 , . . . , δxt )

)=

(δx3 , . . . , δxt , g(δx1 , . . . , δxt ), f (δx2 , . . . , δxt , g(δx1 , . . . , δxt )

)The product of A( f ) and A(g) is defined to correspond to thet-linear map which sends (δx1 , . . . , δxt ) to(δx2 , . . . , δxt−1 , f (δx2 , . . . , δxt , g(δx1 , . . . , δxt )

).(

A( f )A(g))

xt+2,xt ,...,x1�

∑xt+1∈XA( f )xt+2,xt+1,...,x2A(g)xt+1,xt ,...,x1

Flow of time in the inhomogeneous higher order Markov chaindefines the product of tensors!

10 / 34

Page 14: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Inhomogeneous Markov chainLet f and g be two multilinear maps from RX × · · · × RX︸ ︷︷ ︸

t

to RX.

M f Mg is surely still a t-multilinear map:

(M f ◦Mg)(δx1 , . . . , δxt )

= M f(δx2 , . . . , δxt , g(δx1 , . . . , δxt )

)=

(δx3 , . . . , δxt , g(δx1 , . . . , δxt ), f (δx2 , . . . , δxt , g(δx1 , . . . , δxt )

)The product of A( f ) and A(g) is defined to correspond to thet-linear map which sends (δx1 , . . . , δxt ) to(δx2 , . . . , δxt−1 , f (δx2 , . . . , δxt , g(δx1 , . . . , δxt )

).(

A( f )A(g))

xt+2,xt ,...,x1�

∑xt+1∈XA( f )xt+2,xt+1,...,x2A(g)xt+1,xt ,...,x1

Flow of time in the inhomogeneous higher order Markov chaindefines the product of tensors!

10 / 34

Page 15: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Weighted multivariable digraph and itsadjacency tensor

Let R be any semiring, say Boolean semiring, tropical semiring,binary field, complex field, ....

An R-weighted t-variable digraph on a set K is a map fromt︷ ︸︸ ︷

K × · · · × K to RK .

When R is the Boolean semiring (thus RK can be identified withthe power set 2K) and t = 1, an R-weighted t-variable digraph isa usual digraph.

Each R-weighted t-variable digraph f on K corresponds to its

adjacency tensor A( f ) ∈ R

t+1︷ ︸︸ ︷K × · · · × K :

f (k1, . . . , kt) =∑

kt+1∈K

A( f )kt+1,...,k1δkt+1

11 / 34

Page 16: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Weighted multivariable digraph and itsadjacency tensor

Let R be any semiring, say Boolean semiring, tropical semiring,binary field, complex field, ....

An R-weighted t-variable digraph on a set K is a map fromt︷ ︸︸ ︷

K × · · · × K to RK .

When R is the Boolean semiring (thus RK can be identified withthe power set 2K) and t = 1, an R-weighted t-variable digraph isa usual digraph.

Each R-weighted t-variable digraph f on K corresponds to its

adjacency tensor A( f ) ∈ R

t+1︷ ︸︸ ︷K × · · · × K :

f (k1, . . . , kt) =∑

kt+1∈K

A( f )kt+1,...,k1δkt+1

11 / 34

Page 17: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Weighted multivariable digraph and itsadjacency tensor

Let R be any semiring, say Boolean semiring, tropical semiring,binary field, complex field, ....

An R-weighted t-variable digraph on a set K is a map fromt︷ ︸︸ ︷

K × · · · × K to RK .

When R is the Boolean semiring (thus RK can be identified withthe power set 2K) and t = 1, an R-weighted t-variable digraph isa usual digraph.

Each R-weighted t-variable digraph f on K corresponds to its

adjacency tensor A( f ) ∈ R

t+1︷ ︸︸ ︷K × · · · × K :

f (k1, . . . , kt) =∑

kt+1∈K

A( f )kt+1,...,k1δkt+1

11 / 34

Page 18: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Weighted multivariable digraph and itsadjacency tensor

Let R be any semiring, say Boolean semiring, tropical semiring,binary field, complex field, ....

An R-weighted t-variable digraph on a set K is a map fromt︷ ︸︸ ︷

K × · · · × K to RK .

When R is the Boolean semiring (thus RK can be identified withthe power set 2K) and t = 1, an R-weighted t-variable digraph isa usual digraph.

Each R-weighted t-variable digraph f on K corresponds to its

adjacency tensor A( f ) ∈ R

t+1︷ ︸︸ ︷K × · · · × K :

f (k1, . . . , kt) =∑

kt+1∈K

A( f )kt+1,...,k1δkt+1

11 / 34

Page 19: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Multivalued binary operations intropical geometry and hypergroup

Figure: Oleg Virohttp://www.math.stonybrook.edu/~oleg/

Multivalued binary operations:K × K to 2K \ {∅}

I Oleg Viro, On basic concepts oftropical geometry, Proceedings ofthe Steklov Institute ofMathematics 273 (2011), 252–282.

I V. M. Buchstaber, n-valuedgroups: theory and applications,Moscow Mathematical Journal 6(2006), 57–84.

12 / 34

Page 20: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Tensor multiplication

For any two (t + 1)-fold R-weighted tensors B,C ∈ R

t+1︷ ︸︸ ︷K × · · · × K ,

we define the product of B and C to be the (t + 1)-foldR-weighted tensor, which is denoted by BC and sends(k1, . . . , kt+1) ∈ Kt+1 to

BC(k1, . . . , kt+1) =∑k∈K

B(k1, k, k2, . . . , kt)C(k, k2, . . . , kt+1).

This multiplication is essentially the same with the tensormultiplication defined by Grone.

I Robert Grone, Markov chains and tensor multiplications,Journal of Algebra, 109 (1987) 14–24.

13 / 34

Page 21: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Tensor multiplication

For any two (t + 1)-fold R-weighted tensors B,C ∈ R

t+1︷ ︸︸ ︷K × · · · × K ,

we define the product of B and C to be the (t + 1)-foldR-weighted tensor, which is denoted by BC and sends(k1, . . . , kt+1) ∈ Kt+1 to

BC(k1, . . . , kt+1) =∑k∈K

B(k1, k, k2, . . . , kt)C(k, k2, . . . , kt+1).

This multiplication is essentially the same with the tensormultiplication defined by Grone.

I Robert Grone, Markov chains and tensor multiplications,Journal of Algebra, 109 (1987) 14–24.

13 / 34

Page 22: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Tensor algebra

• The set of (t + 1)-fold R-weighted tensors with the previousmultiplication rule form the tensor algebra A(t + 1,R,K).

• The map f that sends (k1, . . . , kt+1) ∈ Kt+1 to δk1,k2 is the leftidentity for the tensor multiplication in A(t + 1,R,K).

• Composition of maps here models the flow of time and soit is no surprise to find that A(t + 1,R,K) is generally not anassociative algebra.

QuestionWhat is the structure of A(t + 1,R,K)?

14 / 34

Page 23: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Boolean coefficient digraphs as monads

A monad is a mapping of afinite set into itself. – V. I.Arnold, The Topology ofAlgebra: Combinatorics ofSquaring, Functional Analysisand Its Applications 37 (2003)177–190.

The study of Boolean coefficientmultivariable digraphs is a study of somespecial monads.

Boris Z. Shapiro (1990)Maxim E. Kazarian (1991)Ernesto Rosales-Gonzalez (1991)Oleg G. Galkin (1992)Michael Z. Shapiro (1992)Alexander Kh. Rakhimov (1995)Francesca Aicardi (1996)Yuri V. Chekanov (1997)Emmanuel Ferrand (1997)Petr E. Pushkar (1998)Jacques-Olivier Moussafir (2000)Mauricio Garay (2001)Fabien Napolitano (2001)Ricardo Uribe-Vargas (2001)Mikhail B. Mishustin (2002)Adriana Ortiz-Rodriguez (2002)Gianmarco Capitanio (2004)Oleg N. Karpenkov (2005)Alexander M. Lukatsky (1975) (2006, DSci)

Alexander Givental

To Whom It May ConcernNo estь i Boжii sud ...

M. �. Lermontov, “Smertь Poзta”6

Posthumousmemoirs seemtohave the unintendedeffect of reducing the person’s life to a collection ofstories. For most of us it would probably be a justand welcome outcome, but for Vladimir Arnold,I think, it would not. He tried and managed totell us many different things about mathematics,education, and beyond, and in many cases we’vebeen rather slow listening or thinking, so I believewe will be returning again and again not only toour memories of him but to his own words aswell. What is found below is not a memoir, but arecommendation letter, albeit a weak one, for hedid not get the prize, and yet hopefully useful asan interim attempt to overview his mathematicalheritage.

January 25, 2005Dear Members of [the name of the committee],

You have requested my commentaries on the workof Vladimir Arnold. Writing them is an honorableand pleasurable task for me.

In the essence the task is easy:Yes, Vladimir Arnold fully merits [the name of

the prize] since his achievements are of extraordi-

nary depth and influence.His work indeed resolves fundamental prob-

lems, and introduces unifying principles, and

opens up major new areas, and (at least in some of

Alexander Givental is professor of mathematics at the

University of California, Berkeley. His email address is

[email protected], there is God’s Court, too. . ., M. Yu. Lermontov,“Death of Poet”.

At Dubna, 2006.

these areas) it introduces powerful new techniquestoo.

On the other hand, writing this letter is noteasy, mainly because the ways Arnold’s work con-tributes to our knowledge are numerous and go farbeyond my personal comprehension. As Arnold’sstudent, I am quite familiar with those aspects ofhis work which inspired my own research. Outsidethese areas, hopefully, I will be able to conveythe conventional wisdom about Arnold’s most fa-mous achievements. Yet this leaves out the oceanof numerous, possibly less famous but extremelyinfluential, contributions, of which I have onlypartial knowledge and understanding. So, I willhave to be selective here and will mention just ahandful of examples which I am better familiarwith and which for this reason may look chosenrandomly.

Perhaps the most legendary, so to speak, ofArnold’s contributions is his work on small

denominators,7 followed by the discovery ofArnold’s diffusion,8 and known now as part of theKolmogorov–Arnold–Moser theory. Among otherthings, this work contains an explanation (or, de-pending on the attitude, a proof, and a highlytechnical one) of stability of the solar planetarysystem. Even more importantly, the KAM theoryprovides a very deep insight into the real-world dy-namics (perhaps one of the few such insights so far,one more being stability of Anosov’s systems) and

7Small denominators III. Problems of stability of motionin classical and celestial mechanics, Uspekhi Mat. Nauk

18 (1963), no. 6, 91–192, following Small denominatorsI. Mappings of a circle onto itself, Izvestia AN SSSR, Ser.

Mat. 25 (1961), 21–86, Small denominators II. Proof of atheorem of A. N. Kolmogorov on the preservation of con-ditionally periodic motions under a small perturbationof the Hamiltonian, Uspekhi Mat. Nauk 18 (1963), no. 5,

13–40, and a series of announcements in DAN SSSR.8Instability of dynamical systems with many degrees offreedom, DAN SSSR 156 (1964), 9–12.

March 2012 Notices of the AMS 381

Figure: Vladimir Arnold(1937–2010) http://www.ams.org/notices/201203/rtx120300378p.pdf

15 / 34

Page 24: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Single-variable graph theory

Let K be a set. A single-variable digraph f , also known as atopological Markov chain, on K is a map from K to 2K .

It is most convenient to identify f with the digraph Γ f havingvertex set K and arc set {k → j : j ∈ f (k)}.

The map f induces a Boolean linear map M f from 2K to 2K ,called the Markov operator associated to f , such thatM f (A) = ∪k∈A f (k) for each A ∈ 2K .

The phase space of f , denoted by PS f , has vertex set 2K andarc set {A→ M f (A) : A ∈ 2K}. The structure of PS f tells us thedynamical behavior of M f .

16 / 34

Page 25: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Single-variable graph theory

Let K be a set. A single-variable digraph f , also known as atopological Markov chain, on K is a map from K to 2K .

It is most convenient to identify f with the digraph Γ f havingvertex set K and arc set {k → j : j ∈ f (k)}.

The map f induces a Boolean linear map M f from 2K to 2K ,called the Markov operator associated to f , such thatM f (A) = ∪k∈A f (k) for each A ∈ 2K .

The phase space of f , denoted by PS f , has vertex set 2K andarc set {A→ M f (A) : A ∈ 2K}. The structure of PS f tells us thedynamical behavior of M f .

16 / 34

Page 26: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Single-variable graph theory

Let K be a set. A single-variable digraph f , also known as atopological Markov chain, on K is a map from K to 2K .

It is most convenient to identify f with the digraph Γ f havingvertex set K and arc set {k → j : j ∈ f (k)}.

The map f induces a Boolean linear map M f from 2K to 2K ,called the Markov operator associated to f , such thatM f (A) = ∪k∈A f (k) for each A ∈ 2K .

The phase space of f , denoted by PS f , has vertex set 2K andarc set {A→ M f (A) : A ∈ 2K}. The structure of PS f tells us thedynamical behavior of M f .

16 / 34

Page 27: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Single-variable graph theory

Let K be a set. A single-variable digraph f , also known as atopological Markov chain, on K is a map from K to 2K .

It is most convenient to identify f with the digraph Γ f havingvertex set K and arc set {k → j : j ∈ f (k)}.

The map f induces a Boolean linear map M f from 2K to 2K ,called the Markov operator associated to f , such thatM f (A) = ∪k∈A f (k) for each A ∈ 2K .

The phase space of f , denoted by PS f , has vertex set 2K andarc set {A→ M f (A) : A ∈ 2K}. The structure of PS f tells us thedynamical behavior of M f .

16 / 34

Page 28: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

An example

A B

C D

D ∅ ABC ABCD

C A B CD BD

AD

BC ACD AB BCD AC

ABD

Figure: A digraph f on {A, B,C,D} and its phase space PS f .

The phase space consists of those limit cycles together withthe transient parts (in-trees attached to the limit cycles).

17 / 34

Page 29: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Multivariable graph theoryLet K be a set and t a positive integer. A digraph f on K in tvariables is a map from Kt to 2K .

It can be graphicallyrepresented by its De Bruijn form Γ f , a digraph with vertex setKt and arc set

{(k1, . . . , kt)→ (k2, . . . , kt+1) : kt+1 ∈ f (k1, . . . , kt)}.

By ignoring isolated vertices, the De Bruijn forms of Booleandigraphs are exactly Rauzy graphs in symbolic dynamics. Themap f induces a Boolean t-linear map M f from (2k)t to (2k)t,called the Markov operator associated to f , such that

M f (A1, . . . , At) =(A2, . . . , At,∪k1∈A1,...,kt∈At f (k1, . . . , kt)

).

The phase space of f , denoted by PS f , has vertex set (2K)t

and arc set {A→ M f (A) : A ∈ (2K)t}. The structure of PS f tellsus the dynamical behavior of M f .

18 / 34

Page 30: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Multivariable graph theoryLet K be a set and t a positive integer. A digraph f on K in tvariables is a map from Kt to 2K . It can be graphicallyrepresented by its De Bruijn form Γ f , a digraph with vertex setKt and arc set

{(k1, . . . , kt)→ (k2, . . . , kt+1) : kt+1 ∈ f (k1, . . . , kt)}.

By ignoring isolated vertices, the De Bruijn forms of Booleandigraphs are exactly Rauzy graphs in symbolic dynamics. Themap f induces a Boolean t-linear map M f from (2k)t to (2k)t,called the Markov operator associated to f , such that

M f (A1, . . . , At) =(A2, . . . , At,∪k1∈A1,...,kt∈At f (k1, . . . , kt)

).

The phase space of f , denoted by PS f , has vertex set (2K)t

and arc set {A→ M f (A) : A ∈ (2K)t}. The structure of PS f tellsus the dynamical behavior of M f .

18 / 34

Page 31: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Multivariable graph theoryLet K be a set and t a positive integer. A digraph f on K in tvariables is a map from Kt to 2K . It can be graphicallyrepresented by its De Bruijn form Γ f , a digraph with vertex setKt and arc set

{(k1, . . . , kt)→ (k2, . . . , kt+1) : kt+1 ∈ f (k1, . . . , kt)}.

By ignoring isolated vertices, the De Bruijn forms of Booleandigraphs are exactly Rauzy graphs in symbolic dynamics.

Themap f induces a Boolean t-linear map M f from (2k)t to (2k)t,called the Markov operator associated to f , such that

M f (A1, . . . , At) =(A2, . . . , At,∪k1∈A1,...,kt∈At f (k1, . . . , kt)

).

The phase space of f , denoted by PS f , has vertex set (2K)t

and arc set {A→ M f (A) : A ∈ (2K)t}. The structure of PS f tellsus the dynamical behavior of M f .

18 / 34

Page 32: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Multivariable graph theoryLet K be a set and t a positive integer. A digraph f on K in tvariables is a map from Kt to 2K . It can be graphicallyrepresented by its De Bruijn form Γ f , a digraph with vertex setKt and arc set

{(k1, . . . , kt)→ (k2, . . . , kt+1) : kt+1 ∈ f (k1, . . . , kt)}.

By ignoring isolated vertices, the De Bruijn forms of Booleandigraphs are exactly Rauzy graphs in symbolic dynamics. Themap f induces a Boolean t-linear map M f from (2k)t to (2k)t,called the Markov operator associated to f , such that

M f (A1, . . . , At) =(A2, . . . , At,∪k1∈A1,...,kt∈At f (k1, . . . , kt)

).

The phase space of f , denoted by PS f , has vertex set (2K)t

and arc set {A→ M f (A) : A ∈ (2K)t}. The structure of PS f tellsus the dynamical behavior of M f .

18 / 34

Page 33: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Multivariable graph theoryLet K be a set and t a positive integer. A digraph f on K in tvariables is a map from Kt to 2K . It can be graphicallyrepresented by its De Bruijn form Γ f , a digraph with vertex setKt and arc set

{(k1, . . . , kt)→ (k2, . . . , kt+1) : kt+1 ∈ f (k1, . . . , kt)}.

By ignoring isolated vertices, the De Bruijn forms of Booleandigraphs are exactly Rauzy graphs in symbolic dynamics. Themap f induces a Boolean t-linear map M f from (2k)t to (2k)t,called the Markov operator associated to f , such that

M f (A1, . . . , At) =(A2, . . . , At,∪k1∈A1,...,kt∈At f (k1, . . . , kt)

).

The phase space of f , denoted by PS f , has vertex set (2K)t

and arc set {A→ M f (A) : A ∈ (2K)t}. The structure of PS f tellsus the dynamical behavior of M f .

18 / 34

Page 34: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

An example

11 12

21 22

22

∅2 2∅ ∅∅

K∅∅K

1∅ ∅1

1121 12 2K K1 1K KK

K2

Figure: The De Bruijn form Γ f and the phase space PS f of a2-variable digraph f on K = [2] = {1, 2}.

19 / 34

Page 35: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Wave-particle dualityOne can view Γ f as a local dynamical mechanism and PS f asthe global evolving picture. The question is to see how to linkthe local with the global.

One can also think of Γ f as the particle version of f and PS f asthe wave version of f . Both versions encode full informationabout f in some way.

http://fossbytes.com/wp-content/uploads/2015/03/light-wave-particle-duality-.jpg

20 / 34

Page 36: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Wave-particle dualityOne can view Γ f as a local dynamical mechanism and PS f asthe global evolving picture. The question is to see how to linkthe local with the global.

One can also think of Γ f as the particle version of f and PS f asthe wave version of f . Both versions encode full informationabout f in some way.

http://fossbytes.com/wp-content/uploads/2015/03/light-wave-particle-duality-.jpg

20 / 34

Page 37: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Primitive digraphsFor A = (A1, . . . , At) ∈ (2K)t and B = (B1, . . . , Bt) ∈ (2K)t, we writeA � B provided Ai ⊆ Bi for i = 1, . . . , t.

This makes (2K)t a posetwith the maximum element (K, . . . ,K)︸ ︷︷ ︸

t

= Kt and the minimum

element ∅t. An element from (2K)t which has ∅ as one of its tcomponents is called a vacant element.

Let f be a t-variable digraph on K. It is clear that every vacantelement of (2K)t will reach ∅t in PS f eventually.

We call f primitive provided every element from Kt will reach Kt

in PS f , equivalently, every non-vacant element will go to thelargest element Kt in PS f and every vacant element will go tothe minimum one ∅t in PS f .

21 / 34

Page 38: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Primitive digraphsFor A = (A1, . . . , At) ∈ (2K)t and B = (B1, . . . , Bt) ∈ (2K)t, we writeA � B provided Ai ⊆ Bi for i = 1, . . . , t. This makes (2K)t a posetwith the maximum element (K, . . . ,K)︸ ︷︷ ︸

t

= Kt and the minimum

element ∅t. An element from (2K)t which has ∅ as one of its tcomponents is called a vacant element.

Let f be a t-variable digraph on K. It is clear that every vacantelement of (2K)t will reach ∅t in PS f eventually.

We call f primitive provided every element from Kt will reach Kt

in PS f , equivalently, every non-vacant element will go to thelargest element Kt in PS f and every vacant element will go tothe minimum one ∅t in PS f .

21 / 34

Page 39: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Primitive digraphsFor A = (A1, . . . , At) ∈ (2K)t and B = (B1, . . . , Bt) ∈ (2K)t, we writeA � B provided Ai ⊆ Bi for i = 1, . . . , t. This makes (2K)t a posetwith the maximum element (K, . . . ,K)︸ ︷︷ ︸

t

= Kt and the minimum

element ∅t. An element from (2K)t which has ∅ as one of its tcomponents is called a vacant element.

Let f be a t-variable digraph on K. It is clear that every vacantelement of (2K)t will reach ∅t in PS f eventually.

We call f primitive provided every element from Kt will reach Kt

in PS f , equivalently, every non-vacant element will go to thelargest element Kt in PS f and every vacant element will go tothe minimum one ∅t in PS f .

21 / 34

Page 40: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Primitive digraphsFor A = (A1, . . . , At) ∈ (2K)t and B = (B1, . . . , Bt) ∈ (2K)t, we writeA � B provided Ai ⊆ Bi for i = 1, . . . , t. This makes (2K)t a posetwith the maximum element (K, . . . ,K)︸ ︷︷ ︸

t

= Kt and the minimum

element ∅t. An element from (2K)t which has ∅ as one of its tcomponents is called a vacant element.

Let f be a t-variable digraph on K. It is clear that every vacantelement of (2K)t will reach ∅t in PS f eventually.

We call f primitive provided every element from Kt will reach Kt

in PS f , equivalently, every non-vacant element will go to thelargest element Kt in PS f and every vacant element will go tothe minimum one ∅t in PS f .

21 / 34

Page 41: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

A primitive 3-variable digraph

KKK

1KK

2KK12K

K2K1K2

KK2

21K

K1K

11KK11

211

1K1

2K122KK22

221

121

222122

112111

212

K21

K12

2K2

KK1

112

111

211

121

212

221

222

122

Figure: The De Bruijn form and part of the phase space induced bynon-vacant vertices of a 3-variable digraph on K = [2]. This digraph isnot primitive in the sense of Chang-Pearson-Zhang (SIMAX, 2011). 22 / 34

Page 42: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Hitting indices

Let f be a t-variable digraph on K. For a, b ∈ Kt, we defineHI f (a, b) to be the set

{n > 0 : b � Mnf (a) ∈ (2K)t} = {n > 0 : b ∈ Mn

f (a) ∈ (2K)t}

and call it the set of hitting indices of f from a to b.

For any a, b ∈ Kt with a , b, the distance from a to b in f isdefined to be Dist f (a, b) = minHI f (a, b). The diameter of f ,denoted by Dia( f ), is maxa,b Dist f (a, b).

23 / 34

Page 43: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Hitting indices

Let f be a t-variable digraph on K. For a, b ∈ Kt, we defineHI f (a, b) to be the set

{n > 0 : b � Mnf (a) ∈ (2K)t} = {n > 0 : b ∈ Mn

f (a) ∈ (2K)t}

and call it the set of hitting indices of f from a to b.

For any a, b ∈ Kt with a , b, the distance from a to b in f isdefined to be Dist f (a, b) = minHI f (a, b).

The diameter of f ,denoted by Dia( f ), is maxa,b Dist f (a, b).

23 / 34

Page 44: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Hitting indices

Let f be a t-variable digraph on K. For a, b ∈ Kt, we defineHI f (a, b) to be the set

{n > 0 : b � Mnf (a) ∈ (2K)t} = {n > 0 : b ∈ Mn

f (a) ∈ (2K)t}

and call it the set of hitting indices of f from a to b.

For any a, b ∈ Kt with a , b, the distance from a to b in f isdefined to be Dist f (a, b) = minHI f (a, b). The diameter of f ,denoted by Dia( f ), is maxa,b Dist f (a, b).

23 / 34

Page 45: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Strongly connected digraphs

Let f be a t-variable digraph on K.

We call f strongly connected if HI f (a, b) , ∅ for all a, b ∈ Kt. Iff is strongly connected and |K| > 1, then Dia( f ) < ∞.

24 / 34

Page 46: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Strongly connected digraphs

Let f be a t-variable digraph on K.

We call f strongly connected if HI f (a, b) , ∅ for all a, b ∈ Kt.

Iff is strongly connected and |K| > 1, then Dia( f ) < ∞.

24 / 34

Page 47: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Strongly connected digraphs

Let f be a t-variable digraph on K.

We call f strongly connected if HI f (a, b) , ∅ for all a, b ∈ Kt. Iff is strongly connected and |K| > 1, then Dia( f ) < ∞.

24 / 34

Page 48: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Periodic classes

Let f be a strongly connected t-variable digraph on K.

Lemma (W., Xu, Zhu)For all a, b ∈ Kt,

gcd(HI f (a, a)) = gcd(HI f (b, b)).

We define the period of f to be per( f ) = gcd(HI f (a, a)), a ∈ Kt.

We say a, b ∈ Kt are equivalent for f provided gcd(HI f (a, b)) isa multiple of per( f ). Each such equivalent class for f is called aperiodic class of f .

25 / 34

Page 49: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Periodic classes

Let f be a strongly connected t-variable digraph on K.

Lemma (W., Xu, Zhu)For all a, b ∈ Kt,

gcd(HI f (a, a)) = gcd(HI f (b, b)).

We define the period of f to be per( f ) = gcd(HI f (a, a)), a ∈ Kt.

We say a, b ∈ Kt are equivalent for f provided gcd(HI f (a, b)) isa multiple of per( f ). Each such equivalent class for f is called aperiodic class of f .

25 / 34

Page 50: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Periodic classes

Let f be a strongly connected t-variable digraph on K.

Lemma (W., Xu, Zhu)For all a, b ∈ Kt,

gcd(HI f (a, a)) = gcd(HI f (b, b)).

We define the period of f to be per( f ) = gcd(HI f (a, a)), a ∈ Kt.

We say a, b ∈ Kt are equivalent for f provided gcd(HI f (a, b)) isa multiple of per( f ). Each such equivalent class for f is called aperiodic class of f .

25 / 34

Page 51: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Cyclic decomposition into periodicclasses

Theorem (W., Xu, Zhu)Let f be a strongly connected t-variable digraph on K withper( f ) = p. Then the periodic classes of f can be enumeratedas C1, . . . ,Cp such that the following hold:

(i) (C1, . . . ,Cp) is a longest cycle in PS f ;

(ii) gcd(HI(a, b)) ≡ j − i (mod p) for a ∈ Ci and b ∈ C j;

(iii) there are p nonempty subsets A1, . . . , Ap of K such thatCi = Ai × Ai+1 × · · · × Ai+t−1 ⊆ Kt, i ∈ [p], where thesubscripts are calculated modulo p.

A1

A2

A3

A4

A1 × A2 = C1A2 × A3 = C2

A3 × A4 = C3 A4 × A1 = C4

Figure: Periodic classes of a 2-variable strongly connected digraph.

26 / 34

Page 52: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Cyclic decomposition and box partition

Let (V, E) be a hypergraph, namely E is a subset of the powerset of V.

We use τ(V, E) for the covering number of (V, E), that is theminimum size |A| of a subset A of E such that the union of theelements in A is V, and we say τ(V, E) is +∞ if such a coveringA does not exist.

We write π(V, E) for the partition number of (V, E), that is theminimum size |A| of a subset A of E such that the disjoint unionof the elements in A is V, and we say π(V, E) is +∞ if such apartition A does not exist. Surely, π(V, E) is no smaller thanτ(V, E).

27 / 34

Page 53: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Box partition and periodLet V1, . . . ,Vs be s sets. An s-box of V1 × · · · × Vs is a setW = W1 × · · · ×Ws where Wi is a subset of Vi for i = 1, . . . , s. Let(V, E) be a box-hypergraph, where V = V1 × · · · × Vs and eachelement of E is an s-box of V1 × · · · × Vs. Let Ei = {Wi : W ∈ E}be the projections of E to the ith set Vi.

Theorem (Alon, Bohman, Holzman, Kleitman 1)If τ(Vi, Ei) ≥ 2 for all i, then π(V, E) ≥ 2s.

QuestionIs it true that π(V, E) ≥

∏si=1 τ(Vi, Ei)?

Theorem (Qian, W., Zhu)Let Φ be a cyclic decomposition of (Kt,K, t) with period p,where t ≥ 2. Then, p ≥ 2d

√te.

1N. Alon, T. Bohman, R. Holzman, D. J. Kleitman, On partitions of discreteboxes, Discrete Math. 257 (2002), 255–258.

28 / 34

Page 54: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Set of periodsLet Pt,k be the set of positive integers which can be the periodof a t-variable strongly connected digraph on [k] and letPt = ∪k>0Pt,k.

It is known that Pt ⊇ {1, 2t, 3t, . . .}. It is also known thatP2,4 = {1, 4, 8, 9, 10, 11, 12, 14, 16}.

aa ab

acbccc

cd

cbdb

babb

adbd dc

dd

cada

aa ab ba bb

acbc

cccddd

da

ad

dbbd dc

cacb

Figure: The De Bruijn forms of 2-variable digraphs with peroids 8 and11.

29 / 34

Page 55: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Set of periodsLet Pt,k be the set of positive integers which can be the periodof a t-variable strongly connected digraph on [k] and letPt = ∪k>0Pt,k.

It is known that Pt ⊇ {1, 2t, 3t, . . .}.

It is also known thatP2,4 = {1, 4, 8, 9, 10, 11, 12, 14, 16}.

aa ab

acbccc

cd

cbdb

babb

adbd dc

dd

cada

aa ab ba bb

acbc

cccddd

da

ad

dbbd dc

cacb

Figure: The De Bruijn forms of 2-variable digraphs with peroids 8 and11.

29 / 34

Page 56: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Set of periodsLet Pt,k be the set of positive integers which can be the periodof a t-variable strongly connected digraph on [k] and letPt = ∪k>0Pt,k.

It is known that Pt ⊇ {1, 2t, 3t, . . .}. It is also known thatP2,4 = {1, 4, 8, 9, 10, 11, 12, 14, 16}.

aa ab

acbccc

cd

cbdb

babb

adbd dc

dd

cada

aa ab ba bb

acbc

cccddd

da

ad

dbbd dc

cacb

Figure: The De Bruijn forms of 2-variable digraphs with peroids 8 and11.

29 / 34

Page 57: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Set of periods, Cont’d

Theorem (W., Xu, Zhu)

• For any positive integer t, the set N \Pt is finite.• N \P1 = ∅ ( N \P2 = {2, 3, 5, 6, 7} ( N \P3.

• ∩i>0Pi = {1}.

QuestionIs it true that P1 ) P2 ) P3 ) P4 ) · · · ?

30 / 34

Page 58: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Set of periods, Cont’d

Theorem (W., Xu, Zhu)

• For any positive integer t, the set N \Pt is finite.• N \P1 = ∅ ( N \P2 = {2, 3, 5, 6, 7} ( N \P3.

• ∩i>0Pi = {1}.

QuestionIs it true that P1 ) P2 ) P3 ) P4 ) · · · ?

30 / 34

Page 59: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Primitive digraph and primitive exponent

Theorem (W., Xu, Zhu)Let f be a multivariable digraph. Then f is primitive if and onlyif f is strongly connected and has period one.

For a primitive t-variable digraph f on K, we define

g( f ) = min{i > 0 : Mif (A) = Kt for all A ∈ Kt}

and call it the primitive exponent of f .

31 / 34

Page 60: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Primitive digraph and primitive exponent

Theorem (W., Xu, Zhu)Let f be a multivariable digraph. Then f is primitive if and onlyif f is strongly connected and has period one.

For a primitive t-variable digraph f on K, we define

g( f ) = min{i > 0 : Mif (A) = Kt for all A ∈ Kt}

and call it the primitive exponent of f .

31 / 34

Page 61: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

Maximum primitive exponent

Let Dt,k be the set of primitive t-variable digraphs on [k] and let

γ(t, k) := maxf∈Dt,k

g( f ).

Proposition

• Wielandt’s bound (1959): γ(1, k) = (k − 1)2 + 1;• γ(2, 1) = 1, γ(2, 2) = 7, γ(2, 3) = 23 = (2 × 3 − 1)2 − 2;• γ(2, k) ≥ (2k − 1)2 + 1 when k ≥ 4;• kt ≤ γ(t, k) for any positive integer t.

Let rk be the minimum number of multiplications to multiply two k by kmatrices (rank of the k by k matrix multiplication tensor). Bycoincidence (?), r1 = 1, r2 = 7, r3 ≤ 23. Can we expect rk ≤ γ(2, k)?

32 / 34

Page 62: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

A generalization of Wielandt’s bound

Conjecture (W., Xu, Zhu)γ(2, k) = (2k − 1)2 + 1 when k ≥ 4.

Theorem (Z. Chen, W.)For any positive integers t and k,

kt ≤ γ(t, k) ≤ t(k − 1)(kt − 1) + 1.

When (t, k) = (2, 2) and when t = 1, the second inequality holdsas an equality.

33 / 34

Page 63: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

A generalization of Wielandt’s bound

Conjecture (W., Xu, Zhu)γ(2, k) = (2k − 1)2 + 1 when k ≥ 4.

Theorem (Z. Chen, W.)For any positive integers t and k,

kt ≤ γ(t, k) ≤ t(k − 1)(kt − 1) + 1.

When (t, k) = (2, 2) and when t = 1, the second inequality holdsas an equality.

33 / 34

Page 64: From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28  · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

THANK YOU!

Figure: http://www.staff.science.uu.nl/~looij101/Coxeter_groups_illustrated.html#3

34 / 34