From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28 · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood

From Markov chains to Multivariabledigraphs

Yaokun Wu (~�n)Shanghai Jiao Tong University

G2M2Yekaterinburg, July 28, 2017

Markov chains Weighted multivariable digraphs Multivariable digraphs with Boolean coefficients

This talk is mainly based on:

• Yaokun Wu, Zeying Xu (ML¤), Yinfeng Zhu (6Û¸), Anexpansion property of Boolean linear maps, The ElectronicJournal of Linear Algebra, 31 (2016) 381–407.

• Yaokun Wu, Zeying Xu, Yinfeng Zhu, Strongly connectedmultivariate digraphs, The Electronic Journal ofCombinatorics, 24(1) (2017) #P1.47, 44 pp.

• Zongchen Chen (�m�), Yaokun Wu, An expansionproperty of Boolean multilinear maps, preprint.

• Chengyang Qian (a§�), Yaokun Wu, Yinfeng Zhu, Somenonexistence results for cyclic decompositions, preprint.

• ...

2 / 34


Markov chain

Figure: Andrey Markov,14 June 1856 – 20 July1922 https://en.wikipedia.org/wiki/Andrey_Markov

A Markov chain is a sequence ofpossible events in which theprobability of every event relieslinearly on the probability of thestates attained in the previousevent.

More generally, for any positiveinteger t, an order-t Markov chainis a sequence of possible events inwhich the probability of each eventdepends multi-linearly on theprobability of the states attained inthe previous t events.

3 / 34

https://en.wikipedia.org/wiki/Andrey_Markov




Markov chain, Contd.

Let E = (E1, E2, E3, . . .) be a sequence of random variableswhich take values in a finite state space X. We call E an order-tMarkov chain if it holds

Pr(Em = x : Em−1 = xt, Em−2 = xt−1, . . . , Em−t = x1)

, Ax,xt ,xt−1,...,x1 ,

for all m ≥ t + 1.

For the (t + 1)-dimensional array (tensor) A to represent aprobability transition, we surely have A ≥ 0 and∑

x∈XAx,xt ,xt−1,...,x1 = 1 for all xt, xt−1, . . . , x1 ∈ X.

4 / 34


Markov chain, again

• A finite set: X

• The Kronecker delta function at x ∈ X: δx ∈ RX

• The probability simplex on X:∆X = {p ∈ RX : p ≥ 0,

∑x∈X p(x) = 1}, namely the convex hull

of {δx : x ∈ X}.

• An order-t Markov chain on X: a multilinear mapf : RX × · · · × RX︸︷︷︸

t

→ RX such that f maps ∆X × · · · × ∆X︸︷︷︸t

into

∆X.

5 / 34


Markov chain, again, Contd.

• Markov operator for f : the map M f which sends(p(1) ⊗ p(2) ⊗ · · · ⊗ p(t)) ∈ RX × · · · × RX︸︷︷︸

t

to

p(2) ⊗ p(3) ⊗ . . . ⊗ p(t) ⊗ f (p(1), p(2), . . . , p(t))

• To understand the higher order Markov chain f is to studythe dynamical behaviours of the t-linear map M f from∆X × · · · × ∆X︸︷︷︸

t

to itself.

• The adjacency tensor for f : A( f ) ∈ RX×···×X as specified byf (δx1 , . . . , δxt ) =

∑xt+1∈XA( f )xt+1,...,x1δxt+1

• The De Bruijn form of f : the weighted digraph Γ f withvertex set Xt and the set of arcs x1 · · · xt → x2 · · · xt+1 withweight A( f )xt+1,...,x1 .

6 / 34


JOURNAL OF ALGEBRAIC STATISTICSVol. 3, No. 1, 2012, 1-10ISSN 1309-3452 – www.jalgstat.com

Geometry of Higher-Order Markov Chains

Bernd Sturmfels

Department of Mathematics, University of California at Berkeley, Berkeley, CA 94720, USA

Abstract. We determine an explicit Grobner basis, consisting of linear forms and determinantalquadrics, for the prime ideal of Raftery’s mixture transition distribution model for Markov chains.When the states are binary, the corresponding projective variety is a linear space, the model itselfconsists of two simplices in a cross-polytope, and the likelihood function typically has two localmaxima. In the general non-binary case, the model corresponds to a cone over a Segre variety.

2000 Mathematics Subject Classifications: Primary 60J10, 13P25; Secondary 62H05

Key Words and Phrases: Markov chains, Grobner bases, Maximum likelihood

1. Introduction

In this note we investigate Adrian Raftery’s mixture transition distribution model(MTD) from the perspective of algebraic statistics [4, 8]. The MTD model, which wasfirst proposed in [9], has a wide range of applications in engineering and the sciences [10].The article by Berchtold and Raftery [2] offers a detailed introduction and review.

The point of departure for this project was a conjecture due to Donald Richards [11],stating that the likelihood function of an MTD model can have multiple local maxima.We establish this conjecture for the case of binary states in Proposition 1.

Our main result, to be derived in Section 4, gives an explicit Grobner basis for theMTD model. Here, both the sequence length and the number of states are arbitrary.

We begin with an algebraic description of the model in [2, 9]. Fix a pair of positiveintegers l and m, and set N = ml+1 − 1. We define the statistical model MTDl,m whosestate space is the set [m]l+1 of sequences i0i1 · · · il of length l+ 1 over the alphabet [m] ={1, 2, . . . ,m}. The model has (m − 1)m + l − 1 parameters, given by the entries of anm ×m-transition matrix (qij) and a probability distribution λ = (λ1, . . . , λl) on the set[l] = {1, 2, . . . , l} of the hidden states. Thus the parameter space is the product of simplices(∆m−1)m ×∆l−1. The model MTDl,m will be a semialgebraic subset of the simplex ∆N .That simplex has its coordinates pi0i1···il indexed by sequences in [m]l+1.

The model MTDl,m is the image of the bilinear map

φl,m : (∆m−1)m ×∆l−1 → ∆N

Email addresses: [email protected] (B. Sturmfels)

http://www.jalgstat.com/ 1 c© 2012 JAlgStat All rights reserved.

For the order-t Markov chain f , M f is a linear map when t = 1,while M f is in general not any linear map when t > 1!

When t = 1, the dynamical behaviour of M f is quite clear; Whent > 1, the dynamical behaviour of M f is far from understood.

7 / 34


A simple sample question

QuestionLet f be an order-t Markov chainon a finite set X and letΨ ∈ ∆X × · · · × ∆X︸︷︷︸

t

. We call

Φ ∈ ∆X × · · · × ∆X︸︷︷︸t

a history of Ψ in f

provided there exists a positiveinteger n such that Mn

f (Φ) = Ψ.

Is it true that Ψ can only have afinite number of histories?

Figure: http://simplecoffee.ru/ourstory.html

8 / 34

http://simplecoffee.ru/ourstory.html




A simple sample question, Contd.

When t = 1, the question has a positive answer: Every Markovchain must have a beginning (Loo-Keng Hua (uÛ�), 1984).This also explains that there only exist one-sided Markovchains.

This is claimed to be a result in the folklore (Kai Lai Chung (¨m4), Chance & Choice, World Scientific, 2004).

9 / 34


A simple sample question, Contd.

When t = 1, the question has a positive answer: Every Markovchain must have a beginning (Loo-Keng Hua (uÛ�), 1984).This also explains that there only exist one-sided Markovchains.

This is claimed to be a result in the folklore (Kai Lai Chung (¨m4), Chance & Choice, World Scientific, 2004).

9 / 34


Inhomogeneous Markov chainLet f and g be two multilinear maps from RX × · · · × RX︸︷︷︸

t

to RX.

M f Mg is surely still a t-multilinear map:

(M f ◦Mg)(δx1 , . . . , δxt )

= M f(δx2 , . . . , δxt , g(δx1 , . . . , δxt )

)=

(δx3 , . . . , δxt , g(δx1 , . . . , δxt ), f (δx2 , . . . , δxt , g(δx1 , . . . , δxt )

)The product of A( f ) and A(g) is defined to correspond to thet-linear map which sends (δx1 , . . . , δxt ) to(δx2 , . . . , δxt−1 , f (δx2 , . . . , δxt , g(δx1 , . . . , δxt )

).(

A( f )A(g))

xt+2,xt ,...,x1�

∑xt+1∈XA( f )xt+2,xt+1,...,x2A(g)xt+1,xt ,...,x1

Flow of time in the inhomogeneous higher order Markov chaindefines the product of tensors!

10 / 34



t

to RX.


(M f ◦Mg)(δx1 , . . . , δxt )

= M f(δx2 , . . . , δxt , g(δx1 , . . . , δxt )

)=


)

The product of A( f ) and A(g) is defined to correspond to thet-linear map which sends (δx1 , . . . , δxt ) to(δx2 , . . . , δxt−1 , f (δx2 , . . . , δxt , g(δx1 , . . . , δxt )

).(

A( f )A(g))

xt+2,xt ,...,x1�



10 / 34



t

to RX.


(M f ◦Mg)(δx1 , . . . , δxt )

= M f(δx2 , . . . , δxt , g(δx1 , . . . , δxt )

)=



).(

A( f )A(g))

xt+2,xt ,...,x1�



10 / 34



t

to RX.


(M f ◦Mg)(δx1 , . . . , δxt )

= M f(δx2 , . . . , δxt , g(δx1 , . . . , δxt )

)=



).(

A( f )A(g))

xt+2,xt ,...,x1�



10 / 34


Weighted multivariable digraph and itsadjacency tensor

Let R be any semiring, say Boolean semiring, tropical semiring,binary field, complex field, ....

An R-weighted t-variable digraph on a set K is a map fromt︷︸︸︷

K × · · · × K to RK .

When R is the Boolean semiring (thus RK can be identified withthe power set 2K) and t = 1, an R-weighted t-variable digraph isa usual digraph.

Each R-weighted t-variable digraph f on K corresponds to its

adjacency tensor A( f ) ∈ R

t+1︷︸︸︷K × · · · × K :

f (k1, . . . , kt) =∑

kt+1∈K

A( f )kt+1,...,k1δkt+1

11 / 34





K × · · · × K to RK .




t+1︷︸︸︷K × · · · × K :

f (k1, . . . , kt) =∑

kt+1∈K

A( f )kt+1,...,k1δkt+1

11 / 34





K × · · · × K to RK .




t+1︷︸︸︷K × · · · × K :

f (k1, . . . , kt) =∑

kt+1∈K

A( f )kt+1,...,k1δkt+1

11 / 34





K × · · · × K to RK .




t+1︷︸︸︷K × · · · × K :

f (k1, . . . , kt) =∑

kt+1∈K

A( f )kt+1,...,k1δkt+1

11 / 34


Multivalued binary operations intropical geometry and hypergroup

Figure: Oleg Virohttp://www.math.stonybrook.edu/~oleg/

Multivalued binary operations:K × K to 2K \ {∅}

I Oleg Viro, On basic concepts oftropical geometry, Proceedings ofthe Steklov Institute ofMathematics 273 (2011), 252–282.

I V. M. Buchstaber, n-valuedgroups: theory and applications,Moscow Mathematical Journal 6(2006), 57–84.

12 / 34

http://www.math.stonybrook.edu/~oleg/




Tensor multiplication

For any two (t + 1)-fold R-weighted tensors B,C ∈ R

t+1︷︸︸︷K × · · · × K ,

we define the product of B and C to be the (t + 1)-foldR-weighted tensor, which is denoted by BC and sends(k1, . . . , kt+1) ∈ Kt+1 to

BC(k1, . . . , kt+1) =∑k∈K

B(k1, k, k2, . . . , kt)C(k, k2, . . . , kt+1).

This multiplication is essentially the same with the tensormultiplication defined by Grone.

I Robert Grone, Markov chains and tensor multiplications,Journal of Algebra, 109 (1987) 14–24.

13 / 34


Tensor multiplication

For any two (t + 1)-fold R-weighted tensors B,C ∈ R

t+1︷︸︸︷K × · · · × K ,

we define the product of B and C to be the (t + 1)-foldR-weighted tensor, which is denoted by BC and sends(k1, . . . , kt+1) ∈ Kt+1 to

BC(k1, . . . , kt+1) =∑k∈K

B(k1, k, k2, . . . , kt)C(k, k2, . . . , kt+1).

This multiplication is essentially the same with the tensormultiplication defined by Grone.

I Robert Grone, Markov chains and tensor multiplications,Journal of Algebra, 109 (1987) 14–24.

13 / 34


Tensor algebra

• The set of (t + 1)-fold R-weighted tensors with the previousmultiplication rule form the tensor algebra A(t + 1,R,K).

• The map f that sends (k1, . . . , kt+1) ∈ Kt+1 to δk1,k2 is the leftidentity for the tensor multiplication in A(t + 1,R,K).

• Composition of maps here models the flow of time and soit is no surprise to find that A(t + 1,R,K) is generally not anassociative algebra.

QuestionWhat is the structure of A(t + 1,R,K)?

14 / 34


Boolean coefficient digraphs as monads

A monad is a mapping of afinite set into itself. – V. I.Arnold, The Topology ofAlgebra: Combinatorics ofSquaring, Functional Analysisand Its Applications 37 (2003)177–190.

The study of Boolean coefficientmultivariable digraphs is a study of somespecial monads.

Boris Z. Shapiro (1990)Maxim E. Kazarian (1991)Ernesto Rosales-Gonzalez (1991)Oleg G. Galkin (1992)Michael Z. Shapiro (1992)Alexander Kh. Rakhimov (1995)Francesca Aicardi (1996)Yuri V. Chekanov (1997)Emmanuel Ferrand (1997)Petr E. Pushkar (1998)Jacques-Olivier Moussafir (2000)Mauricio Garay (2001)Fabien Napolitano (2001)Ricardo Uribe-Vargas (2001)Mikhail B. Mishustin (2002)Adriana Ortiz-Rodriguez (2002)Gianmarco Capitanio (2004)Oleg N. Karpenkov (2005)Alexander M. Lukatsky (1975) (2006, DSci)

Alexander Givental

To Whom It May ConcernNo estь i Boжii sud ...

M. �. Lermontov, “Smertь Poзta”6

Posthumousmemoirs seemtohave the unintendedeffect of reducing the person’s life to a collection ofstories. For most of us it would probably be a justand welcome outcome, but for Vladimir Arnold,I think, it would not. He tried and managed totell us many different things about mathematics,education, and beyond, and in many cases we’vebeen rather slow listening or thinking, so I believewe will be returning again and again not only toour memories of him but to his own words aswell. What is found below is not a memoir, but arecommendation letter, albeit a weak one, for hedid not get the prize, and yet hopefully useful asan interim attempt to overview his mathematicalheritage.

January 25, 2005Dear Members of [the name of the committee],

You have requested my commentaries on the workof Vladimir Arnold. Writing them is an honorableand pleasurable task for me.

In the essence the task is easy:Yes, Vladimir Arnold fully merits [the name of

the prize] since his achievements are of extraordi-

nary depth and influence.His work indeed resolves fundamental prob-

lems, and introduces unifying principles, and

opens up major new areas, and (at least in some of

Alexander Givental is professor of mathematics at the

University of California, Berkeley. His email address is

[email protected], there is God’s Court, too. . ., M. Yu. Lermontov,“Death of Poet”.

At Dubna, 2006.

these areas) it introduces powerful new techniquestoo.

On the other hand, writing this letter is noteasy, mainly because the ways Arnold’s work con-tributes to our knowledge are numerous and go farbeyond my personal comprehension. As Arnold’sstudent, I am quite familiar with those aspects ofhis work which inspired my own research. Outsidethese areas, hopefully, I will be able to conveythe conventional wisdom about Arnold’s most fa-mous achievements. Yet this leaves out the oceanof numerous, possibly less famous but extremelyinfluential, contributions, of which I have onlypartial knowledge and understanding. So, I willhave to be selective here and will mention just ahandful of examples which I am better familiarwith and which for this reason may look chosenrandomly.

Perhaps the most legendary, so to speak, ofArnold’s contributions is his work on small

denominators,7 followed by the discovery ofArnold’s diffusion,8 and known now as part of theKolmogorov–Arnold–Moser theory. Among otherthings, this work contains an explanation (or, de-pending on the attitude, a proof, and a highlytechnical one) of stability of the solar planetarysystem. Even more importantly, the KAM theoryprovides a very deep insight into the real-world dy-namics (perhaps one of the few such insights so far,one more being stability of Anosov’s systems) and

7Small denominators III. Problems of stability of motionin classical and celestial mechanics, Uspekhi Mat. Nauk

18 (1963), no. 6, 91–192, following Small denominatorsI. Mappings of a circle onto itself, Izvestia AN SSSR, Ser.

Mat. 25 (1961), 21–86, Small denominators II. Proof of atheorem of A. N. Kolmogorov on the preservation of con-ditionally periodic motions under a small perturbationof the Hamiltonian, Uspekhi Mat. Nauk 18 (1963), no. 5,

13–40, and a series of announcements in DAN SSSR.8Instability of dynamical systems with many degrees offreedom, DAN SSSR 156 (1964), 9–12.

March 2012 Notices of the AMS 381

Figure: Vladimir Arnold(1937–2010) http://www.ams.org/notices/201203/rtx120300378p.pdf

15 / 34

http://www.ams.org/notices/201203/rtx120300378p.pdf





Single-variable graph theory

Let K be a set. A single-variable digraph f , also known as atopological Markov chain, on K is a map from K to 2K .

It is most convenient to identify f with the digraph Γ f havingvertex set K and arc set {k → j : j ∈ f (k)}.

The map f induces a Boolean linear map M f from 2K to 2K ,called the Markov operator associated to f , such thatM f (A) = ∪k∈A f (k) for each A ∈ 2K .

The phase space of f , denoted by PS f , has vertex set 2K andarc set {A→ M f (A) : A ∈ 2K}. The structure of PS f tells us thedynamical behavior of M f .

16 / 34







16 / 34







16 / 34







16 / 34


An example

A B

C D

D ∅ ABC ABCD

C A B CD BD

AD

BC ACD AB BCD AC

ABD

Figure: A digraph f on {A, B,C,D} and its phase space PS f .

The phase space consists of those limit cycles together withthe transient parts (in-trees attached to the limit cycles).

17 / 34


Multivariable graph theoryLet K be a set and t a positive integer. A digraph f on K in tvariables is a map from Kt to 2K .

It can be graphicallyrepresented by its De Bruijn form Γ f , a digraph with vertex setKt and arc set

{(k1, . . . , kt)→ (k2, . . . , kt+1) : kt+1 ∈ f (k1, . . . , kt)}.

By ignoring isolated vertices, the De Bruijn forms of Booleandigraphs are exactly Rauzy graphs in symbolic dynamics. Themap f induces a Boolean t-linear map M f from (2k)t to (2k)t,called the Markov operator associated to f , such that

M f (A1, . . . , At) =(A2, . . . , At,∪k1∈A1,...,kt∈At f (k1, . . . , kt)

).

The phase space of f , denoted by PS f , has vertex set (2K)t

and arc set {A→ M f (A) : A ∈ (2K)t}. The structure of PS f tellsus the dynamical behavior of M f .

18 / 34


Multivariable graph theoryLet K be a set and t a positive integer. A digraph f on K in tvariables is a map from Kt to 2K . It can be graphicallyrepresented by its De Bruijn form Γ f , a digraph with vertex setKt and arc set

{(k1, . . . , kt)→ (k2, . . . , kt+1) : kt+1 ∈ f (k1, . . . , kt)}.



).



18 / 34



{(k1, . . . , kt)→ (k2, . . . , kt+1) : kt+1 ∈ f (k1, . . . , kt)}.

By ignoring isolated vertices, the De Bruijn forms of Booleandigraphs are exactly Rauzy graphs in symbolic dynamics.

Themap f induces a Boolean t-linear map M f from (2k)t to (2k)t,called the Markov operator associated to f , such that


).



18 / 34



{(k1, . . . , kt)→ (k2, . . . , kt+1) : kt+1 ∈ f (k1, . . . , kt)}.



).



18 / 34



{(k1, . . . , kt)→ (k2, . . . , kt+1) : kt+1 ∈ f (k1, . . . , kt)}.



).



18 / 34


An example

11 12

21 22

22

∅2 2∅ ∅∅

K∅∅K

1∅ ∅1

1121 12 2K K1 1K KK

K2

Figure: The De Bruijn form Γ f and the phase space PS f of a2-variable digraph f on K = [2] = {1, 2}.

19 / 34


Wave-particle dualityOne can view Γ f as a local dynamical mechanism and PS f asthe global evolving picture. The question is to see how to linkthe local with the global.

One can also think of Γ f as the particle version of f and PS f asthe wave version of f . Both versions encode full informationabout f in some way.

http://fossbytes.com/wp-content/uploads/2015/03/light-wave-particle-duality-.jpg

20 / 34


Wave-particle dualityOne can view Γ f as a local dynamical mechanism and PS f asthe global evolving picture. The question is to see how to linkthe local with the global.

One can also think of Γ f as the particle version of f and PS f asthe wave version of f . Both versions encode full informationabout f in some way.

http://fossbytes.com/wp-content/uploads/2015/03/light-wave-particle-duality-.jpg

20 / 34


Primitive digraphsFor A = (A1, . . . , At) ∈ (2K)t and B = (B1, . . . , Bt) ∈ (2K)t, we writeA � B provided Ai ⊆ Bi for i = 1, . . . , t.

This makes (2K)t a posetwith the maximum element (K, . . . ,K)︸︷︷︸

t

= Kt and the minimum

element ∅t. An element from (2K)t which has ∅ as one of its tcomponents is called a vacant element.

Let f be a t-variable digraph on K. It is clear that every vacantelement of (2K)t will reach ∅t in PS f eventually.

We call f primitive provided every element from Kt will reach Kt

in PS f , equivalently, every non-vacant element will go to thelargest element Kt in PS f and every vacant element will go tothe minimum one ∅t in PS f .

21 / 34


Primitive digraphsFor A = (A1, . . . , At) ∈ (2K)t and B = (B1, . . . , Bt) ∈ (2K)t, we writeA � B provided Ai ⊆ Bi for i = 1, . . . , t. This makes (2K)t a posetwith the maximum element (K, . . . ,K)︸︷︷︸

t






21 / 34



t






21 / 34



t






21 / 34


A primitive 3-variable digraph

KKK

1KK

2KK12K

K2K1K2

KK2

21K

K1K

11KK11

211

1K1

2K122KK22

221

121

222122

112111

212

K21

K12

2K2

KK1

112

111

211

121

212

221

222

122

Figure: The De Bruijn form and part of the phase space induced bynon-vacant vertices of a 3-variable digraph on K = [2]. This digraph isnot primitive in the sense of Chang-Pearson-Zhang (SIMAX, 2011). 22 / 34


Hitting indices

Let f be a t-variable digraph on K. For a, b ∈ Kt, we defineHI f (a, b) to be the set

{n > 0 : b � Mnf (a) ∈ (2K)t} = {n > 0 : b ∈ Mn

f (a) ∈ (2K)t}

and call it the set of hitting indices of f from a to b.

For any a, b ∈ Kt with a , b, the distance from a to b in f isdefined to be Dist f (a, b) = minHI f (a, b). The diameter of f ,denoted by Dia( f ), is maxa,b Dist f (a, b).

23 / 34


Hitting indices


{n > 0 : b � Mnf (a) ∈ (2K)t} = {n > 0 : b ∈ Mn

f (a) ∈ (2K)t}


For any a, b ∈ Kt with a , b, the distance from a to b in f isdefined to be Dist f (a, b) = minHI f (a, b).

The diameter of f ,denoted by Dia( f ), is maxa,b Dist f (a, b).

23 / 34


Hitting indices


{n > 0 : b � Mnf (a) ∈ (2K)t} = {n > 0 : b ∈ Mn

f (a) ∈ (2K)t}


For any a, b ∈ Kt with a , b, the distance from a to b in f isdefined to be Dist f (a, b) = minHI f (a, b). The diameter of f ,denoted by Dia( f ), is maxa,b Dist f (a, b).

23 / 34


Strongly connected digraphs

Let f be a t-variable digraph on K.

We call f strongly connected if HI f (a, b) , ∅ for all a, b ∈ Kt. Iff is strongly connected and |K| > 1, then Dia( f ) < ∞.

24 / 34




We call f strongly connected if HI f (a, b) , ∅ for all a, b ∈ Kt.

Iff is strongly connected and |K| > 1, then Dia( f ) < ∞.

24 / 34




We call f strongly connected if HI f (a, b) , ∅ for all a, b ∈ Kt. Iff is strongly connected and |K| > 1, then Dia( f ) < ∞.

24 / 34


Periodic classes

Let f be a strongly connected t-variable digraph on K.

Lemma (W., Xu, Zhu)For all a, b ∈ Kt,

gcd(HI f (a, a)) = gcd(HI f (b, b)).

We define the period of f to be per( f ) = gcd(HI f (a, a)), a ∈ Kt.

We say a, b ∈ Kt are equivalent for f provided gcd(HI f (a, b)) isa multiple of per( f ). Each such equivalent class for f is called aperiodic class of f .

25 / 34


Periodic classes






25 / 34


Periodic classes






25 / 34


Cyclic decomposition into periodicclasses

Theorem (W., Xu, Zhu)Let f be a strongly connected t-variable digraph on K withper( f ) = p. Then the periodic classes of f can be enumeratedas C1, . . . ,Cp such that the following hold:

(i) (C1, . . . ,Cp) is a longest cycle in PS f ;

(ii) gcd(HI(a, b)) ≡ j − i (mod p) for a ∈ Ci and b ∈ C j;

(iii) there are p nonempty subsets A1, . . . , Ap of K such thatCi = Ai × Ai+1 × · · · × Ai+t−1 ⊆ Kt, i ∈ [p], where thesubscripts are calculated modulo p.

A1

A2

A3

A4

A1 × A2 = C1A2 × A3 = C2

A3 × A4 = C3 A4 × A1 = C4

Figure: Periodic classes of a 2-variable strongly connected digraph.

26 / 34


Cyclic decomposition and box partition

Let (V, E) be a hypergraph, namely E is a subset of the powerset of V.

We use τ(V, E) for the covering number of (V, E), that is theminimum size |A| of a subset A of E such that the union of theelements in A is V, and we say τ(V, E) is +∞ if such a coveringA does not exist.

We write π(V, E) for the partition number of (V, E), that is theminimum size |A| of a subset A of E such that the disjoint unionof the elements in A is V, and we say π(V, E) is +∞ if such apartition A does not exist. Surely, π(V, E) is no smaller thanτ(V, E).

27 / 34


Box partition and periodLet V1, . . . ,Vs be s sets. An s-box of V1 × · · · × Vs is a setW = W1 × · · · ×Ws where Wi is a subset of Vi for i = 1, . . . , s. Let(V, E) be a box-hypergraph, where V = V1 × · · · × Vs and eachelement of E is an s-box of V1 × · · · × Vs. Let Ei = {Wi : W ∈ E}be the projections of E to the ith set Vi.

Theorem (Alon, Bohman, Holzman, Kleitman 1)If τ(Vi, Ei) ≥ 2 for all i, then π(V, E) ≥ 2s.

QuestionIs it true that π(V, E) ≥

∏si=1 τ(Vi, Ei)?

Theorem (Qian, W., Zhu)Let Φ be a cyclic decomposition of (Kt,K, t) with period p,where t ≥ 2. Then, p ≥ 2d

√te.

1N. Alon, T. Bohman, R. Holzman, D. J. Kleitman, On partitions of discreteboxes, Discrete Math. 257 (2002), 255–258.

28 / 34


Set of periodsLet Pt,k be the set of positive integers which can be the periodof a t-variable strongly connected digraph on [k] and letPt = ∪k>0Pt,k.

It is known that Pt ⊇ {1, 2t, 3t, . . .}. It is also known thatP2,4 = {1, 4, 8, 9, 10, 11, 12, 14, 16}.

aa ab

acbccc

cd

cbdb

babb

adbd dc

dd

cada

aa ab ba bb

acbc

cccddd

da

ad

dbbd dc

cacb

Figure: The De Bruijn forms of 2-variable digraphs with peroids 8 and11.

29 / 34



It is known that Pt ⊇ {1, 2t, 3t, . . .}.

It is also known thatP2,4 = {1, 4, 8, 9, 10, 11, 12, 14, 16}.

aa ab

acbccc

cd

cbdb

babb

adbd dc

dd

cada

aa ab ba bb

acbc

cccddd

da

ad

dbbd dc

cacb


29 / 34



It is known that Pt ⊇ {1, 2t, 3t, . . .}. It is also known thatP2,4 = {1, 4, 8, 9, 10, 11, 12, 14, 16}.

aa ab

acbccc

cd

cbdb

babb

adbd dc

dd

cada

aa ab ba bb

acbc

cccddd

da

ad

dbbd dc

cacb


29 / 34


Set of periods, Cont’d

Theorem (W., Xu, Zhu)

• For any positive integer t, the set N \Pt is finite.• N \P1 = ∅ ( N \P2 = {2, 3, 5, 6, 7} ( N \P3.

• ∩i>0Pi = {1}.

QuestionIs it true that P1 ) P2 ) P3 ) P4 ) · · · ?

30 / 34


Set of periods, Cont’d

Theorem (W., Xu, Zhu)

• For any positive integer t, the set N \Pt is finite.• N \P1 = ∅ ( N \P2 = {2, 3, 5, 6, 7} ( N \P3.

• ∩i>0Pi = {1}.

QuestionIs it true that P1 ) P2 ) P3 ) P4 ) · · · ?

30 / 34


Primitive digraph and primitive exponent

Theorem (W., Xu, Zhu)Let f be a multivariable digraph. Then f is primitive if and onlyif f is strongly connected and has period one.

For a primitive t-variable digraph f on K, we define

g( f ) = min{i > 0 : Mif (A) = Kt for all A ∈ Kt}

and call it the primitive exponent of f .

31 / 34


Primitive digraph and primitive exponent

Theorem (W., Xu, Zhu)Let f be a multivariable digraph. Then f is primitive if and onlyif f is strongly connected and has period one.

For a primitive t-variable digraph f on K, we define

g( f ) = min{i > 0 : Mif (A) = Kt for all A ∈ Kt}

and call it the primitive exponent of f .

31 / 34


Maximum primitive exponent

Let Dt,k be the set of primitive t-variable digraphs on [k] and let

γ(t, k) := maxf∈Dt,k

g( f ).

Proposition

• Wielandt’s bound (1959): γ(1, k) = (k − 1)2 + 1;• γ(2, 1) = 1, γ(2, 2) = 7, γ(2, 3) = 23 = (2 × 3 − 1)2 − 2;• γ(2, k) ≥ (2k − 1)2 + 1 when k ≥ 4;• kt ≤ γ(t, k) for any positive integer t.

Let rk be the minimum number of multiplications to multiply two k by kmatrices (rank of the k by k matrix multiplication tensor). Bycoincidence (?), r1 = 1, r2 = 7, r3 ≤ 23. Can we expect rk ≤ γ(2, k)?

32 / 34


A generalization of Wielandt’s bound

Conjecture (W., Xu, Zhu)γ(2, k) = (2k − 1)2 + 1 when k ≥ 4.

Theorem (Z. Chen, W.)For any positive integers t and k,

kt ≤ γ(t, k) ≤ t(k − 1)(kt − 1) + 1.

When (t, k) = (2, 2) and when t = 1, the second inequality holdsas an equality.

33 / 34


A generalization of Wielandt’s bound

Conjecture (W., Xu, Zhu)γ(2, k) = (2k − 1)2 + 1 when k ≥ 4.

Theorem (Z. Chen, W.)For any positive integers t and k,

kt ≤ γ(t, k) ≤ t(k − 1)(kt − 1) + 1.

When (t, k) = (2, 2) and when t = 1, the second inequality holdsas an equality.

33 / 34


THANK YOU!

Figure: http://www.staff.science.uu.nl/~looij101/Coxeter_groups_illustrated.html#3

34 / 34

http://www.staff.science.uu.nl/~looij101/Coxeter_groups_illustrated.html#3

http://www.staff.science.uu.nl/~looij101/Coxeter_groups_illustrated.html#3

Documents

From Markov chains to Multivariable digraphsmath.sjtu.edu.cn/faculty/ykwu/data/Talk/20170728.pdf2017/07/28 · Key Words and Phrases : Markov chains, Gr obner bases, Maximum likelihood