Habilitation à Diriger des Recherches (HDR)

Dissertation

written in view of obtaining the diploma of

HABILITATION À DIRIGER DES RECHERCHES

Speciality: Applied Mathematics

awarded by

Université de Limoges (France)

Doctoral school no. 610: Sciences et Ingénierie des Systèmes, Mathématiques, Informatique (SISMI)

Contributions to optimal control theory with fractional and timescale calculi, and to variational analysis in view of shape optimization problems in contact mechanics

presented and publicly defended by

Loïc Bourdin

on Monday, October 5, 2020

Composition of the jury:

Samir Adly              Université de Limoges (France)                  Advisor (chargé de suivi)
Jérôme Bolte            Université Toulouse Capitole (France)           Examiner
Jean-Baptiste Caillau   Université Côte d’Azur (France)                 Examiner
Marc Quincampoix        Université de Bretagne Occidentale (France)     Reviewer
Tyrrell Rockafellar     University of Washington (Seattle, USA)         Examiner
Emmanuel Trélat         Sorbonne Université (Paris 6, France)           Examiner
Vladimir Veliov         University of Vienna (Austria)                  Reviewer
Richard Vinter          Imperial College London (United Kingdom)        Reviewer

Prelude

The present document has been written in view of obtaining the French diploma “Habilitation à Diriger des Recherches (HDR)” in applied mathematics. The aim of this manuscript is to summarize my contributions in my research fields. For this purpose, I have grouped my works into three sub-categories:

Part I: Contributions to fractional optimal control theory;

Part II: Contributions to optimal sampled-data control theory on time scales;

Part III: Contributions to variational analysis in view of shape optimization problems in contact mechanics.

Each of these three parts, divided into several chapters, aims at presenting succinctly the main results obtained in my research papers. The proofs are omitted, but they are all commented on in order to emphasize the difficulties encountered and the main ideas developed to overcome them. Several perspectives for further research are also discussed throughout the document.

Each of the above three parts has its own bibliography, which can be found at the end of the part in yellow pages. Note that my papers are referred to as [B01] to [B29], and that the papers from the general bibliography are referred to as [1] to [295]. My works are all accessible online on my personal webpage

www.unilim.fr/pages_perso/loic.bourdin/

Click on the section “HDR” in the left menu and then enter the password “hdr2020”. My curriculum vitae can also be found in the same section.

The notations used in my works may vary from one paper to another. For the sake of providing a self-contained dissertation, I took care to harmonize the notations throughout the text. Therefore please note that the notations used in the present manuscript may differ from the ones used in the references [B01] to [B29].

Contents

I Contributions to fractional optimal control theory

1 Introduction to fractional calculus of variations problems
  1.1 Introduction
  1.2 Functional framework and basics from fractional calculus
  1.3 Study of a fractional calculus of variations problem
  1.4 Related works and perspectives

2 PMP for Caputo fractional optimal control problems
  2.1 Introduction
  2.2 Preliminaries on RL and Caputo fractional Cauchy problems
  2.3 Main results
  2.4 Applications to fractional calculus of variations

II Contributions to optimal sampled-data control theory on time scales

3 PMP for state constrained optimal sampled-data control problems on time scales and bouncing trajectory phenomenon
  3.1 Introduction
  3.2 Basics on time scale theory
  3.3 Main result and comments
  3.4 The observation of a bouncing trajectory phenomenon
  3.5 Application to min-max optimal sampled-data control problems
  3.6 Figures

4 Optimal sampled-data control problems with free sampling times and application to functional electrical stimulations in medicine
  4.1 Introduction
  4.2 Optimal sampled-data control problems with free sampling times
  4.3 Application to optimal muscular force response to functional electrical stimulations

5 Convergence results and unified Riccati theory for LQ optimal permanent and sampled-data control problems
  5.1 Introduction
  5.2 A unified Riccati theory
  5.3 Main results of convergence

6 Convergence in nonlinear optimal sampled-data control problems with fixed endpoint
  6.1 Introduction
  6.2 Framework and preliminaries
  6.3 Main results

III Contributions to variational analysis in view of shape optimization problems in contact mechanics

7 Flip procedure in geometric approximation of multiple-component shapes
  7.1 Introduction
  7.2 Intersecting control polygons detection and flip procedure
  7.3 Application to multiple-inclusion detection
  7.4 Concluding comments and perspectives

8 A prelude to Chapters 9, 10 and 11

9 On a decomposition formula for the resolvent operator of the sum of two set-valued maps with monotonicity assumptions
  9.1 Introduction
  9.2 Main results in [B28]
  9.3 Basic application in elliptic PDEs
  9.4 Some comments on the earlier work [B27]

10 The derivative of a parameterized mechanical contact problem with a Tresca friction law involves Signorini unilateral conditions
  10.1 Introduction
  10.2 Basics on Mosco epi-convergence and twice epi-differentiability
  10.3 Main result
  10.4 Illustration with some numerical simulations
  10.5 Concluding remarks and perspectives

11 Sensitivity analysis of variational inequalities via twice epi-differentiability and proto-differentiability of the proximal operator
  11.1 Introduction
  11.2 Objective of the paper [B26] and preliminaries
  11.3 Main contributions
  11.4 Applications and perspectives

Part I

Contributions to fractional optimal control theory

Chapter 1

Introduction to fractional calculus of variations problems

Contents

1.1 Introduction
1.2 Functional framework and basics from fractional calculus
  1.2.1 RL and Caputo fractional operators (left operators)
  1.2.2 RL and Caputo fractional operators (right operators)
  1.2.3 Some useful properties
1.3 Study of a fractional calculus of variations problem
  1.3.1 A fractional Tonelli-type theorem
  1.3.2 Quasi-polynomial growths for integrability and coercivity
  1.3.3 First-order necessary optimality condition of Euler–Lagrange type
  1.3.4 Some illustrative examples
1.4 Related works and perspectives

The present chapter summarizes the contributions of the following five references:

• [B03]: L. Bourdin. Existence of a weak solution for fractional Euler–Lagrange equations. J. Math. Anal. Appl., 399(1):239–251, 2013.

• [B04]: L. Bourdin, T. Odzijewicz, and D. Torres. Existence of minimizers for fractional variational problems containing Caputo derivatives. Adv. Dyn. Syst. Appl., 8(1):3–12, 2013.

• [B05, Chapter VI]: L. Bourdin. Contributions au calcul des variations et au principe du maximum de Pontryagin en calculs time scale et fractionnaire. PhD thesis, University of Pau (France), 2013.

• [B06]: L. Bourdin, T. Odzijewicz, and D. Torres. Existence of minimizers for generalized Lagrangian functionals and a necessary optimality condition - Application to fractional variational problems. Differential Integral Equations, 27(7-8):743–766, 2014.

• [B07]: L. Bourdin and D. Idczak. A fractional fundamental lemma and a fractional integration by parts formula - Applications to critical points of Bolza functionals and to linear boundary value problems. Adv. Differential Equations, 20(3-4):213–232, 2015.

1.1 Introduction

Fractional calculus. Fractional calculus is the mathematical field that deals with the generalization of the standard notions of integral and derivative to any real order. It seems to have originated in 1695 in a letter written by Leibniz to L’Hospital, in which he suggested generalizing his celebrated formula for the kth derivative of a product (where k ∈ N∗ is a positive integer) to any positive real number k > 0. In another letter to Bernoulli, Leibniz mentioned derivatives of general order. Since then, numerous renowned mathematicians have introduced several notions of fractional operators. We can cite the works of Euler (1730s), Fourier (1820s), Liouville (1830s), Riemann (1840s), Sonin (1860s), Grünwald (1860s), Letnikov (1860s), Caputo (1960s), etc. These notions are not unrelated: in most cases it can be proved that two different notions actually coincide or are related by an explicit formula. In the present manuscript we will only make use of the Riemann–Liouville and Caputo fractional operators, which are the most prevalent notions in the literature. The reader who is not familiar with these two notions is referred to Section 1.2 for basic reminders and notations.

For a long time, fractional calculus was only considered as a branch of pure mathematics. In 1974, a first conference dedicated to this topic was organized by Ross at the University of New Haven (Connecticut, USA). Since then, fractional calculus and its applications have experienced a boom in several scientific fields. The uses are so varied that it seems difficult to give a complete overview of the current research involving fractional operators. We can at least mention that fractional calculus is widely applied in the physical context of anomalous diffusion, see e.g. [42, 73, 77, 95, 96, 98, 99]. Due to their nonlocality, fractional operators are also used to take memory effects into account, see e.g. [13, 14, 78] where viscoelasticity is modelled by a fractional differential equation. We also refer to studies in wave mechanics [8], economics [26], biology [43, 66], acoustics [46], thermodynamics [50], probability [62], etc. We refer to [49, 84] for a large panorama of applications of fractional calculus.

Fractional calculus of variations. From the Helmholtz condition [47], it is well known that the oscillator equation with friction cannot be written as a classical Euler–Lagrange equation. From a more general point of view, Riewe [82, 83] raised in 1996 the following issue:

“It is a strange paradox that the most advanced methods of classical mechanics deal only with conservative systems, while almost all classical processes observed in the physical world are nonconservative.”

For the purpose of finding variational structures for dissipative systems, Riewe had the idea of including fractional operators in the calculus of variations. Roughly speaking, Riewe considers the fractional Lagrange functional

\[ \mathcal{L}(x) := \int_a^b L\big(x(\tau), \mathrm{D}^{\alpha}_{a+}[x](\tau), \dot{x}(\tau), \tau\big) \, d\tau, \]

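involving the (left) Riemann–Liouville fractional derivative $\mathrm{D}^{\alpha}_{a+}$ of order $\alpha > 0$. In that framework, the author characterizes the critical points of $\mathcal{L}$ as the solutions to the fractional Euler–Lagrange equation given by

\[ \nabla_1 L\big(x(t), \mathrm{D}^{\alpha}_{a+}[x](t), \dot{x}(t), t\big) + \mathrm{D}^{\alpha}_{b-}\big[\nabla_2 L(x, \mathrm{D}^{\alpha}_{a+}[x], \dot{x}, \cdot)\big](t) - \frac{d}{dt}\big[\nabla_3 L(x, \mathrm{D}^{\alpha}_{a+}[x], \dot{x}, \cdot)\big](t) = 0_{\mathbb{R}^n}. \]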

The key idea of Riewe lies in the composition $\mathrm{D}^{\alpha}_{b-} \circ \mathrm{D}^{\alpha}_{a+}$ (in the middle term), which is meant to recover the classical derivative operator $\frac{d}{dt}$ (and thus dissipative terms) when $\alpha = 1/2$. Although his idea was relevant, his setting was not totally satisfactory because the composition $\mathrm{D}^{\alpha}_{b-} \circ \mathrm{D}^{\alpha}_{a+}$ involves both the left Riemann–Liouville fractional derivative $\mathrm{D}^{\alpha}_{a+}$ and the right one $\mathrm{D}^{\alpha}_{b-}$. Unfortunately no simple formula allows one to express this composition and, in particular, we have $\mathrm{D}^{1/2}_{b-} \circ \mathrm{D}^{1/2}_{a+} \neq \frac{d}{dt}$ in general.

Hence, in the same spirit, Cresson and Inizan introduced in [30] a similar but more conclusive framework based on splitting in two the variable of the fractional Lagrange functional. This framework is called asymmetric. As an application, the authors of [29] obtain an asymmetric fractional variational structure for the convection-diffusion equation.

Since the pioneering works [82, 83] of Riewe, an extensive literature has been devoted to necessary optimality conditions for fractional calculus of variations problems in several directions, see e.g. [2, 10, 11, 16, 28, 75, 76] and references therein. Concerning the state of the art on fractional calculus of variations and corresponding fractional Euler–Lagrange equations, we refer to the book [68] by Malinowska and Torres.

A first contribution during my PhD thesis: fractional variational integrators. My PhD thesis, under the supervision of Cresson and Greff at the University of Pau (France), started in that context in September 2010.

Several methods have been proposed in order to find the exact solutions to fractional (partial) differential equations, such as Laplace, Mellin or Fourier transforms, see e.g. [58, 74]. However these methods cannot be extended to most nonlinear fractional (partial) differential equations. As a consequence there has been a growing interest in developing numerical schemes for such equations, see e.g. [5, 34, 35]. The Grünwald–Letnikov fractional derivative is defined as a limit of finite differences and coincides with the Riemann–Liouville one on a wide class of functions. This notion is therefore particularly suitable for defining discrete fractional operators approximating the Riemann–Liouville fractional derivatives. In order to define a numerical scheme for a given (partial) differential equation involving Riemann–Liouville fractional derivatives, Podlubny [79, 80] substitutes the continuous unknowns by discrete ones and replaces the Riemann–Liouville fractional derivatives by discrete Grünwald–Letnikov fractional operators. This method is widely used in different fields. We refer for example to [71, 72] for applications to fractional dispersion equations and to [88] for a fractional diffusion equation.
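To make the idea of a discrete Grünwald–Letnikov operator concrete, here is a minimal sketch in Python. It is an illustration only and is not taken from [79, 80]; the function names `gl_weights` and `gl_derivative` are hypothetical. The weights $w_k = (-1)^k \binom{\alpha}{k}$ are computed by the standard recurrence, and the approximation is compared with the known Riemann–Liouville derivative of $x(t) = t$ on $[0,1]$, namely $t^{1-\alpha}/\Gamma(2-\alpha)$.

```python
import math

def gl_weights(alpha, n):
    """Return w_0, ..., w_n with w_k = (-1)^k * binom(alpha, k)."""
    w = [1.0]
    for k in range(1, n + 1):
        # Recurrence w_k = w_{k-1} * (1 - (alpha + 1) / k)
        w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
    return w

def gl_derivative(x, alpha, a, t, n):
    """Finite-difference (Grünwald–Letnikov) approximation of the left
    fractional derivative of order alpha at time t, with n steps on [a, t]."""
    h = (t - a) / n
    w = gl_weights(alpha, n)
    return sum(w[k] * x(t - k * h) for k in range(n + 1)) / h ** alpha

alpha = 0.5
approx = gl_derivative(lambda s: s, alpha, 0.0, 1.0, 1000)
exact = 1.0 / math.gamma(2.0 - alpha)  # RL derivative of x(t) = t at t = 1, a = 0
print(approx, exact)
```

For $\alpha = 1$ the weights reduce to $(1, -1, 0, 0, \dots)$, so the formula becomes the usual backward difference quotient. For a smooth function vanishing at $a$, the discrepancy is typically of order $h$.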

Due to the emergence of a composition between left and right fractional operators, the fractional Euler–Lagrange equations cannot be solved analytically in general. Consequently, it is of interest to develop efficient numerical schemes for such systems. In [17, 19], the authors apply the same method as Podlubny and provide numerical simulations for fractional Euler–Lagrange equations. Nevertheless, the fractional Euler–Lagrange equations admit a variational structure in the sense that they derive from a calculus of variations on a functional. This structure is intrinsic and induces strong constraints on the qualitative behaviour of the solutions. It is then important to preserve this structure at the discrete level. However, the numerical schemes previously mentioned are obtained via a direct discretization, that is, an algebraic procedure based only on the differential form of the equation. Consequently, there is no guarantee that the intrinsic variational structure of the equation is preserved.

There exists a suitable method, called variational integrator, for building numerical schemes for classical Euler–Lagrange equations that preserve their variational structures. The basic idea is to discretize the Lagrange functional and to derive the corresponding discrete Euler–Lagrange equation. This method is well studied in [45, 69] for example. My first contribution as a PhD student, in collaboration with Cresson, Greff and Inizan, was to adapt this method to the fractional case. Precisely, using the discrete Grünwald–Letnikov fractional operators, the fractional Lagrange functional is discretized and the corresponding discrete fractional Euler–Lagrange equation is derived. This contribution has been the subject of the publications [B01, B02] and will not be developed further in the present manuscript. Note that the so-called fractional variational integrators have now been developed in several directions by different authors, see e.g. [65, 93].

Existence results for fractional calculus of variations problems. During my PhD thesis, the literature on fractional calculus of variations was already vast, but also young in a certain sense.

Indeed, most articles focused (only) on first-order necessary optimality conditions of Euler–Lagrange type, and the functional framework was not discussed in general. The fundamental issue of existence of minimizers for fractional Lagrange functionals was not addressed in the literature (except in very particular cases, see e.g. [54, 59]). In my first single-author article [B03], my objective was to fill this gap by providing a rigorous mathematical framework in which sufficient conditions ensuring the existence of minimizers for fractional Lagrange functionals (and thus of solutions to fractional Euler–Lagrange equations) can be established.

Having this objective in mind, my strategy was to follow a very standard approach from infinite dimensional optimization theory. Basically, in order to prove the existence of a minimizer for a real function $F : D \to \mathbb{R}$ (where $D$ stands for the nonempty definition space of $F$), one can provide an appropriate functional framework as follows: give a reflexive Banach space $(E, \|\cdot\|_E)$ and a nonempty constraint set $K \subset (D \cap E)$ such that:

• $F$ is coercive over $K$ in the sense that

\[ \lim_{\substack{\|x\|_E \to +\infty \\ x \in K}} F(x) = +\infty; \]

• $K$ is a weakly closed subset of $E$;

• $F$ is weakly lower semicontinuous on $E$ in the sense that $\liminf_{k \to +\infty} F(x_k) \geq F(x)$ for all sequences $(x_k)_{k \in \mathbb{N}} \subset E$ weakly convergent to some $x \in E$.

In that context one can easily prove that the minimization problem $\min_{x \in K} F(x)$ admits a solution. Indeed, a minimizing sequence in $K$ is bounded by coercivity, admits a weakly convergent subsequence by reflexivity of $E$, its weak limit belongs to $K$ by weak closedness, and this limit is a minimizer by weak lower semicontinuity.

The adaptation of the above approach to classical calculus of variations problems is well known in the literature. Inspired by the book [31] of Dacorogna, I was able in [B03] to extend it to the fractional case. Precisely, considering Sobolev spaces (possibly fractional ones, see Section 1.4) as reflexive Banach spaces and using general assumptions of integrability/coercivity/convexity, the existence of minimizers for fractional Lagrange functionals was obtained. Furthermore, concrete sufficient conditions (in terms of quasi-polynomial growths of the Lagrangian function L), which imply the integrability/coercivity requirements, were established. Following the publication [B03], I started a collaboration [B04, B06] with Odzijewicz and Torres from the University of Aveiro (Portugal), and a collaboration [B07] with Idczak from the University of Lodz (Poland), in order to develop similar techniques for various fractional calculus of variations problems (involving for example several orders and/or different notions of fractional derivatives, and/or different boundary conditions, etc.). For each problem, the functional framework has to be adapted, and the difficulties vary accordingly.

It is not my aim in this chapter to provide a complete overview of the results obtained in the works [B03, B04, B06, B07]. In Section 1.3, I will present the study of (only) one fractional calculus of variations problem, extracted from my PhD thesis [B05, Chapter VI], in order to illustrate the main techniques used in that context. Then, in Section 1.4, I will provide a brief summary of the various frameworks studied in the works [B03, B04, B06, B07], emphasizing the difficulties encountered and the technical tools used to overcome them.

I conclude this introduction by mentioning that the above techniques have now been extended to various fractional frameworks by different authors, see e.g. [15, 67].

1.2 Functional framework and basics from fractional calculus

Throughout this manuscript the abbreviation RL stands for Riemann–Liouville. This section is devoted to basic definitions and results on RL and Caputo fractional operators. Everything presented below is very standard and mostly extracted from the monographs [58, 85] by Kilbas et al. The reader already familiar with this topic can skip this section and proceed directly to Section 1.3.

We first introduce some functional framework. Let $n \in \mathbb{N}^*$ be a fixed positive integer and let $a < b$ be two fixed real numbers. In this chapter, for all real numbers $0 < \lambda \leq 1$, all extended nonnegative integers $k \in \mathbb{N} \cup \{+\infty\}$ and all extended real numbers $1 \leq r \leq +\infty$, we denote by:

• $\mathrm{L}^r := \mathrm{L}^r([a,b],\mathbb{R}^n)$ the Lebesgue space of $r$-integrable functions (or, if $r = \infty$, of essentially bounded functions) defined almost everywhere on $[a,b]$ with values in $\mathbb{R}^n$, endowed with its usual norm $\|\cdot\|_{\mathrm{L}^r}$;

• $\mathrm{W}^{1,r} := \mathrm{W}^{1,r}([a,b],\mathbb{R}^n)$ the subspace of $\mathrm{L}^r$ of usual Sobolev functions, endowed with its standard norm $\|\cdot\|_{\mathrm{W}^{1,r}}$;

• $\mathrm{C} := \mathrm{C}([a,b],\mathbb{R}^n)$ the space of continuous functions defined on $[a,b]$ with values in $\mathbb{R}^n$, endowed with the uniform norm $\|\cdot\|_{\mathrm{C}}$;

• $\mathrm{H}^\lambda := \mathrm{H}^\lambda([a,b],\mathbb{R}^n)$ the subspace of $\mathrm{C}$ of $\lambda$-Hölder continuous functions;

• $\mathrm{AC} := \mathrm{AC}([a,b],\mathbb{R}^n)$ the subspace of $\mathrm{C}$ of absolutely continuous functions;

• $\mathrm{C}^k := \mathrm{C}^k([a,b],\mathbb{R}^n)$ the subspace of $\mathrm{C}$ of $k$ times continuously differentiable functions (or, if $k = \infty$, of infinitely differentiable functions);

• $\mathrm{C}^\infty_c := \mathrm{C}^\infty_c([a,b],\mathbb{R}^n)$ the subspace of $\mathrm{C}^\infty$ of infinitely differentiable functions with compact support included in $(a,b)$.

We denote by $1 \leq r' \leq +\infty$ the usual conjugate of $r$ defined by the equality $\frac{1}{r} + \frac{1}{r'} = 1$. As usual, if $r = \infty$, we consider that $\frac{1}{r} = 0$ by convention. Finally, for any functional set $\mathrm{E} \subset \mathrm{C}$, we denote by $\mathrm{E}_0$ the set of all functions $x \in \mathrm{E}$ such that $x(a) = 0_{\mathbb{R}^n}$. For example $\mathrm{C}^\infty_c \subset \mathrm{C}^\infty_0 \subset \mathrm{AC}_0 \subset \mathrm{C}_0 \subset \mathrm{C}$.

1.2.1 RL and Caputo fractional operators (left operators)

We start with left fractional integrals and derivatives of RL and Caputo types. In the sequel Γ denotes the standard Gamma function.

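Definition 1.1 (Left RL fractional integral). The left RL fractional integral $\mathrm{I}^{\alpha}_{a+}[x]$ of order $\alpha > 0$ of a function $x \in \mathrm{L}^1$ is defined on $[a,b]$ by

\[ \mathrm{I}^{\alpha}_{a+}[x](t) := \int_a^t \frac{(t-\tau)^{\alpha-1}}{\Gamma(\alpha)} \, x(\tau) \, d\tau, \]

provided that the right-hand side term exists. For $\alpha = 0$ we set $\mathrm{I}^{0}_{a+}[x] := x$.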

Proposition 1.2 ([58, Lemma 2.1]). If $\alpha \geq 0$ and $x \in \mathrm{L}^1$, then $\mathrm{I}^{\alpha}_{a+}[x] \in \mathrm{L}^1$.

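Proposition 1.3 ([58, Lemma 2.3]). If $\alpha_1 \geq 0$, $\alpha_2 \geq 0$ and $x \in \mathrm{L}^1$, then the equalities

\[ \mathrm{I}^{\alpha_1}_{a+}\big[\mathrm{I}^{\alpha_2}_{a+}[x]\big] = \mathrm{I}^{\alpha_1+\alpha_2}_{a+}[x] = \mathrm{I}^{\alpha_2+\alpha_1}_{a+}[x] = \mathrm{I}^{\alpha_2}_{a+}\big[\mathrm{I}^{\alpha_1}_{a+}[x]\big] \]

hold true.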

Let $\alpha \geq 0$ and $x \in \mathrm{L}^1$. From Proposition 1.2, $\mathrm{I}^{\alpha}_{a+}[x](t)$ exists for almost every $t \in [a,b]$. Throughout the chapter, if $\mathrm{I}^{\alpha}_{a+}[x]$ is equal almost everywhere on $[a,b]$ to a continuous function, then $\mathrm{I}^{\alpha}_{a+}[x]$ is automatically identified with its continuous representative. In that case $\mathrm{I}^{\alpha}_{a+}[x](t)$ is defined for every $t \in [a,b]$.

Proposition 1.4 ([85, Theorem 3.6]). If $\alpha > 0$ and $x \in \mathrm{L}^\infty$, then $\mathrm{I}^{\alpha}_{a+}[x] \in \mathrm{C}_0$.
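As an illustration of the left RL fractional integral, the following Python sketch approximates it for a scalar function by a simple product-integration rule: the function is frozen at the midpoint of each cell, while the weakly singular kernel is integrated exactly on that cell. The function name `rl_integral_left` and the choice of quadrature are assumptions made for this example and do not come from [58, 85]. The result is compared with the classical power rule, which for $a = 0$ and $x(\tau) = \tau$ gives $t^{1+\alpha}/\Gamma(2+\alpha)$.

```python
import math

def rl_integral_left(x, alpha, a, t, n=1000):
    """Approximate the left RL fractional integral of order alpha >= 0 of a
    scalar function x at time t >= a (midpoint product-integration rule)."""
    if alpha == 0:
        return x(t)
    h = (t - a) / n
    total = 0.0
    for j in range(n):
        far = (n - j) * h        # distance from t to the left end of cell j
        near = (n - j - 1) * h   # distance from t to the right end of cell j
        midpoint = t - 0.5 * (far + near)
        # Exact integral of (t - tau)^(alpha - 1) over cell j, up to 1/Gamma(alpha + 1)
        total += (far ** alpha - near ** alpha) * x(midpoint)
    return total / math.gamma(alpha + 1.0)

alpha = 0.5
value = rl_integral_left(lambda s: s, alpha, 0.0, 1.0)
exact = 1.0 / math.gamma(2.0 + alpha)
print(value, exact)
```

With this rule the integral of a constant function is reproduced up to rounding, since the cell weights telescope to $(t-a)^{\alpha}/\Gamma(\alpha+1)$.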

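Definition 1.5 (Left RL fractional derivative). We say that $x \in \mathrm{L}^1$ possesses a left RL fractional derivative $\mathrm{D}^{\alpha}_{a+}[x]$ of order $0 \leq \alpha \leq 1$ if $\mathrm{I}^{1-\alpha}_{a+}[x] \in \mathrm{AC}$. In that case $\mathrm{D}^{\alpha}_{a+}[x] \in \mathrm{L}^1$ is defined by

\[ \mathrm{D}^{\alpha}_{a+}[x](t) := \frac{d}{dt}\big[\mathrm{I}^{1-\alpha}_{a+}[x]\big](t), \]

for almost every $t \in [a,b]$. We denote by $\mathrm{AC}^{\alpha}_{a+} := \mathrm{AC}^{\alpha}_{a+}([a,b],\mathbb{R}^n)$ the space of all functions $x \in \mathrm{L}^1$ possessing a left RL fractional derivative $\mathrm{D}^{\alpha}_{a+}[x]$ of order $0 \leq \alpha \leq 1$.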

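Remark 1.6. If $\alpha = 1$, $\mathrm{AC}^{1}_{a+} = \mathrm{AC}$ and $\mathrm{D}^{1}_{a+}[x] = \dot{x}$ for any $x \in \mathrm{AC}$. If $\alpha = 0$, $\mathrm{AC}^{0}_{a+} = \mathrm{L}^1$ and $\mathrm{D}^{0}_{a+}[x] = x$ for any $x \in \mathrm{L}^1$.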

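Proposition 1.7 ([B07, Proposition 5]). Let $0 \leq \alpha \leq 1$ and $x \in \mathrm{L}^1$. Then $x \in \mathrm{AC}^{\alpha}_{a+}$ if and only if there exists $(x_a, y) \in \mathbb{R}^n \times \mathrm{L}^1$ such that

\[ x(t) = \frac{(t-a)^{\alpha-1}}{\Gamma(\alpha)} \, x_a + \mathrm{I}^{\alpha}_{a+}[y](t), \]

for almost every $t \in [a,b]$. In that case it holds that $x_a = \mathrm{I}^{1-\alpha}_{a+}[x](a)$ and $y = \mathrm{D}^{\alpha}_{a+}[x]$.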

Remark 1.8. From Proposition 1.7, one can observe that a function $x \in \mathrm{AC}^{\alpha}_{a+}$, with $0 < \alpha < 1$, admits a singularity at $t = a$ in general. As a consequence, when dealing with problems involving RL fractional derivatives, this feature requires serious attention (see Section 1.4 for example). In order to avoid this pitfall, authors may prefer the notion of Caputo fractional derivative recalled below.

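Definition 1.9 (Left Caputo fractional derivative). We say that $x \in \mathrm{C}$ possesses a left Caputo fractional derivative ${}_{c}\mathrm{D}^{\alpha}_{a+}[x]$ of order $0 \leq \alpha \leq 1$ if $x - x(a) \in \mathrm{AC}^{\alpha}_{a+}$. In that case ${}_{c}\mathrm{D}^{\alpha}_{a+}[x] \in \mathrm{L}^1$ is defined by

\[ {}_{c}\mathrm{D}^{\alpha}_{a+}[x](t) := \mathrm{D}^{\alpha}_{a+}[x - x(a)](t), \]

for almost every $t \in [a,b]$. We denote by ${}_{c}\mathrm{AC}^{\alpha}_{a+} := {}_{c}\mathrm{AC}^{\alpha}_{a+}([a,b],\mathbb{R}^n)$ the space of all functions $x \in \mathrm{C}$ possessing a left Caputo fractional derivative ${}_{c}\mathrm{D}^{\alpha}_{a+}[x]$ of order $0 \leq \alpha \leq 1$.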

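Remark 1.10. If $\alpha = 1$, ${}_{c}\mathrm{AC}^{1}_{a+} = \mathrm{AC}$ and ${}_{c}\mathrm{D}^{1}_{a+}[x] = \dot{x}$ for any $x \in \mathrm{AC}$. If $\alpha = 0$, ${}_{c}\mathrm{AC}^{0}_{a+} = \mathrm{C}$ and ${}_{c}\mathrm{D}^{0}_{a+}[x] = x - x(a)$ for any $x \in \mathrm{C}$.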

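Proposition 1.11 ([B10, Proposition 2.5]). Let $0 \leq \alpha \leq 1$ and $x \in \mathrm{C}$. Then $x \in {}_{c}\mathrm{AC}^{\alpha}_{a+}$ if and only if there exists $(x_a, y) \in \mathbb{R}^n \times \mathrm{L}^1$ such that

\[ x(t) = x_a + \mathrm{I}^{\alpha}_{a+}[y](t), \]

for almost every $t \in [a,b]$. In that case the above relation holds with $x_a$ replaced by $x(a)$ and $y$ replaced by ${}_{c}\mathrm{D}^{\alpha}_{a+}[x]$.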

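Remark 1.12. Let $0 \leq \alpha \leq 1$. Note that ${}_{c}\mathrm{AC}^{\alpha}_{a+} = \mathrm{AC}^{\alpha}_{a+} \cap \mathrm{C}$ and, if $\alpha \neq 1$, the notions of RL and Caputo fractional derivatives are related by the explicit formula

\[ {}_{c}\mathrm{D}^{\alpha}_{a+}[x](t) = \mathrm{D}^{\alpha}_{a+}[x](t) - \frac{(t-a)^{-\alpha}}{\Gamma(1-\alpha)} \, x(a), \]

holding for almost every $t \in [a,b]$ and all $x \in {}_{c}\mathrm{AC}^{\alpha}_{a+}$.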

From the above definitions and propositions, one can easily recover the following well known result.

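Proposition 1.13 ([58, Theorem 2.1]). Let $0 \leq \alpha \leq 1$. Then the inclusion $\mathrm{AC} \subset {}_{c}\mathrm{AC}^{\alpha}_{a+}$ holds true with ${}_{c}\mathrm{D}^{\alpha}_{a+}[x] = \mathrm{I}^{1-\alpha}_{a+}[\dot{x}]$ for any $x \in \mathrm{AC}$.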

Remark 1.14. Let $0 \leq \alpha \leq 1$. From Definition 1.5 and Proposition 1.13, one should remember that the RL fractional derivative satisfies the composition $\mathrm{D}^{\alpha}_{a+} = \frac{d}{dt} \circ \mathrm{I}^{1-\alpha}_{a+}$, while the Caputo fractional derivative satisfies the reverse composition ${}_{c}\mathrm{D}^{\alpha}_{a+} = \mathrm{I}^{1-\alpha}_{a+} \circ \frac{d}{dt}$.
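The difference between the two notions can be seen on a polynomial, for which both derivatives are available in closed form through the classical power rule $\mathrm{D}^{\alpha}_{a+}[(\cdot - a)^{k}](t) = \frac{\Gamma(k+1)}{\Gamma(k+1-\alpha)} (t-a)^{k-\alpha}$. The Python sketch below is only an illustration for scalar polynomials and $0 < \alpha < 1$; the helper names are hypothetical. It checks that the RL and Caputo derivatives differ exactly by the singular term $(t-a)^{-\alpha} x(a) / \Gamma(1-\alpha)$, and that the Caputo derivative coincides with the fractional integral of order $1-\alpha$ of the classical derivative.

```python
import math

def rl_derivative_poly(coeffs, alpha, a, t):
    """Left RL derivative (0 < alpha < 1) at t > a of the polynomial
    x(s) = sum_k coeffs[k] * (s - a)**k, using the power rule."""
    return sum(
        c * math.gamma(k + 1) / math.gamma(k + 1 - alpha) * (t - a) ** (k - alpha)
        for k, c in enumerate(coeffs)
    )

def caputo_derivative_poly(coeffs, alpha, a, t):
    """Left Caputo derivative: RL derivative of x - x(a), i.e. the
    constant term coeffs[0] = x(a) is removed first."""
    return rl_derivative_poly([0.0] + list(coeffs[1:]), alpha, a, t)

alpha, a, t = 0.5, 0.0, 0.25
coeffs = [3.0, 2.0]                      # x(s) = 3 + 2 (s - a), so x(a) = 3
rl = rl_derivative_poly(coeffs, alpha, a, t)
caputo = caputo_derivative_poly(coeffs, alpha, a, t)
correction = (t - a) ** (-alpha) / math.gamma(1.0 - alpha) * coeffs[0]
print(rl, caputo, rl - correction)
```

Here the RL derivative of the constant part is not zero and blows up as $t \to a$, whereas the Caputo derivative only sees the non-constant part of $x$.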

1.2.2 RL and Caputo fractional operators (right operators)

This section is devoted to the definitions of right fractional integrals and derivatives of RL and Caputo types.

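Definition 1.15 (Right RL fractional integral). The right RL fractional integral $\mathrm{I}^{\alpha}_{b-}[x]$ of order $\alpha > 0$ of $x \in \mathrm{L}^1$ is defined on $[a,b]$ by

\[ \mathrm{I}^{\alpha}_{b-}[x](t) := \int_t^b \frac{(\tau-t)^{\alpha-1}}{\Gamma(\alpha)} \, x(\tau) \, d\tau, \]

provided that the right-hand side term exists. For $\alpha = 0$ we define $\mathrm{I}^{0}_{b-}[x] := x$.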

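Definition 1.16 (Right RL fractional derivative). We say that $x \in \mathrm{L}^1$ possesses a right RL fractional derivative $\mathrm{D}^{\alpha}_{b-}[x]$ of order $0 \leq \alpha \leq 1$ if $\mathrm{I}^{1-\alpha}_{b-}[x] \in \mathrm{AC}$. In that case $\mathrm{D}^{\alpha}_{b-}[x] \in \mathrm{L}^1$ is defined by

\[ \mathrm{D}^{\alpha}_{b-}[x](t) := -\frac{d}{dt}\big[\mathrm{I}^{1-\alpha}_{b-}[x]\big](t), \]

for almost every $t \in [a,b]$. We denote by $\mathrm{AC}^{\alpha}_{b-} := \mathrm{AC}^{\alpha}_{b-}([a,b],\mathbb{R}^n)$ the set of all functions $x \in \mathrm{L}^1$ possessing a right RL fractional derivative $\mathrm{D}^{\alpha}_{b-}[x]$ of order $0 \leq \alpha \leq 1$.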

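Definition 1.17 (Right Caputo fractional derivative). We say that $x \in \mathrm{C}$ possesses a right Caputo fractional derivative ${}_{c}\mathrm{D}^{\alpha}_{b-}[x]$ of order $0 \leq \alpha \leq 1$ if $x - x(b) \in \mathrm{AC}^{\alpha}_{b-}$. In that case ${}_{c}\mathrm{D}^{\alpha}_{b-}[x] \in \mathrm{L}^1$ is defined by

\[ {}_{c}\mathrm{D}^{\alpha}_{b-}[x](t) := \mathrm{D}^{\alpha}_{b-}[x - x(b)](t), \]

for almost every $t \in [a,b]$. We denote by ${}_{c}\mathrm{AC}^{\alpha}_{b-} := {}_{c}\mathrm{AC}^{\alpha}_{b-}([a,b],\mathbb{R}^n)$ the set of all functions $x \in \mathrm{C}$ possessing a right Caputo fractional derivative ${}_{c}\mathrm{D}^{\alpha}_{b-}[x]$ of order $0 \leq \alpha \leq 1$.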

Each result and remark stated in Section 1.2.1 (for left operators) has a right-counterpart version. I refer the reader to [58, 85] for details.

1.2.3 Some useful properties

In the literature on fractional calculus, it is well known that, for all $\alpha \geq 0$ and all $1 \leq r \leq +\infty$, the RL fractional integral $\mathrm{I}^{\alpha}_{a+}$ is a linear continuous operator from $\mathrm{L}^r$ to $\mathrm{L}^r$, see e.g. [58, Lemma 2.1]. In what follows this property is denoted by $\mathrm{I}^{\alpha}_{a+}[\mathrm{L}^r] \hookrightarrow \mathrm{L}^r$. More refined estimates are known in the literature and are recalled in the next proposition (see [85, Theorem 3.6]). They are of crucial importance for deriving sufficient conditions (as weak as possible) ensuring the existence of minimizers for fractional Lagrange functionals (see Section 1.3.2 for details).

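Proposition 1.18 (Estimations). Let $0 < \alpha < 1$ and $1 < r < +\infty$. The following statements are satisfied:

(i) If $0 < \alpha < \frac{1}{r} < 1$, then $\mathrm{I}^{\alpha}_{a+}[\mathrm{L}^r] \hookrightarrow \mathrm{L}^s$ for all $1 \leq s \leq \frac{r}{1-\alpha r}$;

(ii) If $0 < \alpha = \frac{1}{r} < 1$, then $\mathrm{I}^{\alpha}_{a+}[\mathrm{L}^r] \hookrightarrow \mathrm{L}^s$ for all $1 \leq s < +\infty$;

(iii) If $0 < \frac{1}{r} < \alpha < 1$, then $\mathrm{I}^{\alpha}_{a+}[\mathrm{L}^r] \hookrightarrow \mathrm{H}^{\alpha-(1/r)}_{0}$.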
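The three cases of Proposition 1.18 can be summarized by a small helper that, given $\alpha$ and $r$, returns the corresponding target space. The Python sketch below is purely illustrative; the function name `rl_integral_target` and the returned labels are hypothetical. Exact rational arithmetic is used so that the borderline case $\alpha = 1/r$ is detected reliably.

```python
from fractions import Fraction

def rl_integral_target(alpha, r):
    """Classify the target space of the RL integral of order alpha on L^r.

    Returns ("Lebesgue", s_max) in case (i), ("Lebesgue", None) in case (ii)
    (every finite s is allowed), and ("Hoelder", exponent) in case (iii).
    """
    alpha, r = Fraction(alpha), Fraction(r)
    if not (0 < alpha < 1 and r > 1):
        raise ValueError("expected 0 < alpha < 1 and 1 < r < +infinity")
    if alpha * r < 1:                        # case (i): alpha < 1/r
        return ("Lebesgue", r / (1 - alpha * r))
    if alpha * r == 1:                       # case (ii): alpha = 1/r
        return ("Lebesgue", None)
    return ("Hoelder", alpha - 1 / r)        # case (iii): alpha > 1/r

print(rl_integral_target(Fraction(1, 4), 2))   # case (i)
print(rl_integral_target(Fraction(1, 2), 2))   # case (ii)
print(rl_integral_target(Fraction(3, 4), 2))   # case (iii)
```

For instance, with $r = 2$, the order $\alpha = 1/4$ yields integrability up to $s = 4$, while $\alpha = 3/4$ yields Hölder continuity with exponent $1/4$.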

From Proposition 1.18, one should remember that the larger $r$ is and/or the closer $\alpha$ is to 1, the more integrable (and even the more regular) the RL fractional integral is. We conclude this section with the following well known fractional integration by parts formula (see, e.g., [85, p.34]), which plays a central role in the derivation of first-order necessary optimality conditions of Euler–Lagrange type in fractional calculus of variations problems.

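Proposition 1.19 (Fractional integration by parts formula). Let $\alpha > 0$. If $x_1 \in \mathrm{L}^{r_1}$ and $x_2 \in \mathrm{L}^{r_2}$ with $(1/r_1) + (1/r_2) < 1 + \alpha$, then it holds that

\[ \int_a^b \big\langle \mathrm{I}^{\alpha}_{a+}[x_1](\tau), x_2(\tau) \big\rangle_{\mathbb{R}^n} \, d\tau = \int_a^b \big\langle x_1(\tau), \mathrm{I}^{\alpha}_{b-}[x_2](\tau) \big\rangle_{\mathbb{R}^n} \, d\tau. \]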

Remark 1.20. The above fractional integration by parts formula, which involves both left and right RL fractional integrals, is at the origin of the emergence of both left and right fractional derivatives in fractional Euler–Lagrange equations (see Introduction for example).
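The fractional integration by parts formula expresses that the left and right RL fractional integrals are adjoint to each other. The Python sketch below illustrates this numerically in the scalar case, under assumptions that are mine: a uniform grid, values taken at cell midpoints, and a kernel integrated exactly on each cell (the names `kernel_matrix` and `check_integration_by_parts` are hypothetical). With this discretization the matrix of the right integral is the transpose of the matrix of the left one, so the two sides agree up to rounding.

```python
import math

def kernel_matrix(alpha, h, n):
    """A[i][j] approximates the integral over cell j (restricted to tau < m_i)
    of (m_i - tau)^(alpha - 1) / Gamma(alpha), with m_i the midpoint of cell i."""
    g = math.gamma(alpha + 1.0)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = (0.5 * h) ** alpha / g     # half cell adjacent to m_i
        for j in range(i):
            d = i - j
            A[i][j] = (((d + 0.5) * h) ** alpha - ((d - 0.5) * h) ** alpha) / g
    return A

def check_integration_by_parts(x1, x2, alpha, a, b, n=200):
    """Return the discrete left-hand and right-hand sides of the formula."""
    h = (b - a) / n
    mids = [a + (i + 0.5) * h for i in range(n)]
    A = kernel_matrix(alpha, h, n)
    u = [x1(m) for m in mids]
    v = [x2(m) for m in mids]
    # Left integral of x1 and right integral of x2, both sampled at the midpoints
    left_int = [sum(A[i][j] * u[j] for j in range(i + 1)) for i in range(n)]
    right_int = [sum(A[j][i] * v[j] for j in range(i, n)) for i in range(n)]
    lhs = h * sum(left_int[i] * v[i] for i in range(n))
    rhs = h * sum(u[i] * right_int[i] for i in range(n))
    return lhs, rhs

lhs, rhs = check_integration_by_parts(lambda s: s, lambda s: 1.0, 0.5, 0.0, 1.0)
print(lhs, rhs, 1.0 / math.gamma(3.5))
```

For $x_1(\tau) = \tau$ and $x_2 \equiv 1$ on $[0,1]$, both sides of the continuous formula equal $1/\Gamma(3+\alpha)$, which the discrete values approach as the grid is refined.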

1.3 Study of a fractional calculus of variations problem

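Let $0 < \alpha < 1$ and $1 < r < +\infty$ be fixed. Recall that $\mathrm{W}^{1,r}$ is a reflexive Banach space and that the inclusion $\mathrm{W}^{1,r} \subset {}_{c}\mathrm{AC}^{\alpha}_{a+}$ holds true from Proposition 1.13. Our main goal in this section is to give sufficient conditions ensuring the existence of a minimizer for the fractional Lagrange functional given by

\[ \begin{array}{rcl} \mathcal{L} : \mathrm{K} \subset \mathrm{W}^{1,r} & \longrightarrow & \mathbb{R} \\ x & \longmapsto & \mathcal{L}(x) := \displaystyle\int_a^b L\big(x(\tau), {}_{c}\mathrm{D}^{\alpha}_{a+}[x](\tau), \dot{x}(\tau), \tau\big) \, d\tau, \end{array} \]

involving the (left) Caputo fractional derivative ${}_{c}\mathrm{D}^{\alpha}_{a+}$ of order $\alpha$, where $\mathrm{K} := \{ x \in \mathrm{W}^{1,r} \mid x(a) = x_a \}$, with the initial condition $x_a \in \mathbb{R}^n$ being fixed, and where the Lagrangian function $L : (\mathbb{R}^n)^3 \times [a,b] \to \mathbb{R}$ is assumed to be continuous and of class $\mathrm{C}^1$ with respect to its first three variables.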

1.3.1 A fractional Tonelli-type theorem

From the Rellich–Kondrachov theorem, note that $\mathrm{K}$ is a weakly closed subset of $\mathrm{W}^{1,r}$. Using additional assumptions of integrability/coercivity/convexity (recalled below), we are able in Theorem 1.23 to state a fractional analogue of the classical Tonelli theorem, ensuring the existence of a minimizer for $\mathcal{L}$.

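Definition 1.21 (Integrability). The Lagrangian function $L$ and its gradients are said to be $(\alpha,r)$-integrable if, for all $x \in \mathrm{W}^{1,r}$, we have:

• $L(x, {}_c\mathrm{D}^\alpha_{a+}[x], \dot{x}, \cdot) \in \mathrm{L}^1$, $\nabla_1 L(x, {}_c\mathrm{D}^\alpha_{a+}[x], \dot{x}, \cdot) \in \mathrm{L}^1$ and $\nabla_3 L(x, {}_c\mathrm{D}^\alpha_{a+}[x], \dot{x}, \cdot) \in \mathrm{L}^{r'}$;

• $\nabla_2 L(x, {}_c\mathrm{D}^\alpha_{a+}[x], \dot{x}, \cdot) \in \mathrm{L}^s$ for some $1 \leq s \leq +\infty$ satisfying

– $s \geq \frac{r}{(2-\alpha)r-1}$ if $(1-\alpha)r < 1$;

– $s > 1$ if $(1-\alpha)r = 1$;

– $s \geq 1$ if $(1-\alpha)r > 1$.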
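The threshold appearing in the first sub-case can be recovered from Proposition 1.18 by a short computation. The sketch below is an informal check rather than part of the original argument: it uses the identity ${}_c\mathrm{D}^\alpha_{a+} = \mathrm{I}^{1-\alpha}_{a+} \circ \frac{d}{dt}$, applies item (i) of Proposition 1.18 with the order $1-\alpha$ to a perturbation $w \in \mathrm{W}^{1,r}$, and introduces the auxiliary exponent $q$ (a notation used only here).

```latex
\begin{align*}
&(1-\alpha) r < 1
\;\Longrightarrow\;
{}_c\mathrm{D}^\alpha_{a+}[w] = \mathrm{I}^{1-\alpha}_{a+}[\dot{w}] \in \mathrm{L}^{q}
\quad \text{with} \quad q := \frac{r}{1-(1-\alpha)r}, \\
&\frac{1}{q'} = 1 - \frac{1}{q} = 1 - \frac{1-(1-\alpha)r}{r} = \frac{(2-\alpha)r - 1}{r}
\;\Longrightarrow\;
q' = \frac{r}{(2-\alpha)r-1} .
\end{align*}
```

Since $[a,b]$ is bounded, requiring $s \geq q'$ places $\nabla_2 L$ in $\mathrm{L}^{q'}$, so that the product of $\nabla_2 L$ with ${}_c\mathrm{D}^\alpha_{a+}[w]$ is integrable by the Hölder inequality.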

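Definition 1.22 (Coercivity). The Lagrange functional $\mathcal{L}$ is said to be coercive over $\mathrm{K}$ if

\[
\lim_{\substack{\|x\|_{\mathrm{W}^{1,r}} \to +\infty \\ x \in \mathrm{K}}} \mathcal{L}(x) = +\infty .
\]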

Theorem 1.23 (Tonelli). If $L$ and its gradients are $(\alpha,r)$-integrable, $\mathcal{L}$ is coercive over $\mathrm{K}$ and the function $L(\cdot, \cdot, \cdot, t)$ is convex over $(\mathbb{R}^n)^3$ for any $t \in [a,b]$, then there exists a minimizer for $\mathcal{L}$.

A list of comments is in order.

Remark 1.24. Theorem 1.23 can be found in [B05, Theorem VI.1]. The integrability/convexity assumptions, combined with the estimations of Proposition 1.18, are instrumental in order to obtain the weak lower semicontinuity of $\mathcal{L}$ over $\mathrm{W}^{1,r}$.

Remark 1.25. Note that the larger $r$ is and/or the closer $\alpha$ is to zero, the less restrictive the $(\alpha,r)$-integrability assumption in Theorem 1.23 is.



Remark 1.26. The integrability/coercivity assumptions in Theorem 1.23 are abstract and cannot be checked in a quick and easy way. For this reason we propose in Section 1.3.2 some concrete sufficient conditions in terms of quasi-polynomial growths of the Lagrangian function $L$ and its gradients. Examples are then provided in Section 1.3.4.

Remark 1.27. The convexity assumption in Theorem 1.23 is quite restrictive. In this remark I emphasize that this hypothesis can be relaxed by imposing stronger assumptions on the continuity of $L$ (in terms of uniform equicontinuity), and by invoking the Ascoli theorem. Precisely:

• If $(L(\cdot, v_\alpha, v, t))_{(v_\alpha, v, t) \in (\mathbb{R}^n)^2 \times [a,b]}$ is uniformly equicontinuous, then the convexity hypothesis in Theorem 1.23 can be relaxed, by assuming (only) that $L(x, \cdot, \cdot, t)$ is convex for any $(x,t) \in \mathbb{R}^n \times [a,b]$. In that context we point out that the integrability assumption $\nabla_1 L(x, {}_c\mathrm{D}^\alpha_{a+}[x], \dot{x}, \cdot) \in \mathrm{L}^1$ for any $x \in \mathrm{W}^{1,r}$ is superfluous.

• If $(1-\alpha)r > 1$ and $(L(\cdot, \cdot, v, t))_{(v,t) \in \mathbb{R}^n \times [a,b]}$ is uniformly equicontinuous, then the convexity hypothesis in Theorem 1.23 can be relaxed, by assuming (only) that $L(x, v_\alpha, \cdot, t)$ is convex for any $(x, v_\alpha, t) \in (\mathbb{R}^n)^2 \times [a,b]$. In that context we point out that the integrability assumptions $\nabla_1 L(x, {}_c\mathrm{D}^\alpha_{a+}[x], \dot{x}, \cdot) \in \mathrm{L}^1$ and $\nabla_2 L(x, {}_c\mathrm{D}^\alpha_{a+}[x], \dot{x}, \cdot) \in \mathrm{L}^s$ for any $x \in \mathrm{W}^{1,r}$ are superfluous.

We refer to [B05, Section VI.4] for detailed proofs and discussions on that point. Examples are provided in Section 1.3.4.

1.3.2 Quasi-polynomial growths for integrability and coercivity

Taking advantage of the estimations provided in Proposition 1.18 and using basic Hölder inequalities, our aim in this section is to provide (concrete) sufficient conditions, in terms of quasi-polynomial growths of the Lagrangian function $L$ and its gradients, which guarantee that the (abstract) integrability/coercivity assumptions in Theorem 1.23 are satisfied. To this aim we introduce, for all $R \geq 1$, the following set $\mathcal{P}_R$ of quasi-polynomial functions.

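Definition 1.28 (Set of quasi-polynomial functions). Let $R \geq 1$.

• If $(1-\alpha)r < 1$, the set $\mathcal{P}_R$ stands for the set of functions $P : (\mathbb{R}^n)^3 \times [a,b] \longrightarrow \mathbb{R}_+$ written as

\[
P(x, v_\alpha, v, t) = \sum_{k=1}^N \varphi_k(x,t) \, \|v_\alpha\|^{d_{\alpha,k}}_{\mathbb{R}^n} \, \|v\|^{d_{1,k}}_{\mathbb{R}^n},
\]

for all $(x, v_\alpha, v, t) \in (\mathbb{R}^n)^3 \times [a,b]$, with $d_{\alpha,k}(1-(1-\alpha)r) + d_{1,k} \leq r/R$;

• If $(1-\alpha)r = 1$, the set $\mathcal{P}_R$ stands for the set of functions $P : (\mathbb{R}^n)^3 \times [a,b] \longrightarrow \mathbb{R}_+$ written as

\[
P(x, v_\alpha, v, t) = \sum_{k=1}^N \varphi_k(x,t) \, \|v_\alpha\|^{d_{\alpha,k}}_{\mathbb{R}^n} \, \|v\|^{d_{1,k}}_{\mathbb{R}^n},
\]

for all $(x, v_\alpha, v, t) \in (\mathbb{R}^n)^3 \times [a,b]$, with $d_{\alpha,k} = 0$ and $d_{1,k} \leq r/R$, or with $d_{\alpha,k} \neq 0$ and $d_{1,k} < r/R$;

• If $(1-\alpha)r > 1$, the set $\mathcal{P}_R$ stands for the set of functions $P : (\mathbb{R}^n)^3 \times [a,b] \longrightarrow \mathbb{R}_+$ written as

\[
P(x, v_\alpha, v, t) = \sum_{k=1}^N \varphi_k(x, v_\alpha, t) \, \|v\|^{d_{1,k}}_{\mathbb{R}^n},
\]

for all $(x, v_\alpha, v, t) \in (\mathbb{R}^n)^3 \times [a,b]$, with $d_{1,k} \leq r/R$;

where $N \in \mathbb{N}^*$, $\varphi_k$ are nonnegative continuous real functions and $d_{\alpha,k}, d_{1,k} \in \mathbb{R}_+$ for all $k = 1, \ldots, N$.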

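Proposition 1.29 (Sufficient condition for integrability). If there exist $P \in \mathcal{P}_1$, $P_1 \in \mathcal{P}_1$, $P_2 \in \mathcal{P}_s$ and $P_3 \in \mathcal{P}_{r'}$ such that

\[
|L(x, v_\alpha, v, t)| \leq P(x, v_\alpha, v, t) \quad \text{and} \quad \forall i = 1,2,3, \quad \|\nabla_i L(x, v_\alpha, v, t)\|_{\mathbb{R}^n} \leq P_i(x, v_\alpha, v, t),
\]

for all $(x, v_\alpha, v, t) \in (\mathbb{R}^n)^3 \times [a,b]$ and for some $1 \leq s \leq +\infty$ satisfying the same conditions as in Definition 1.21, then $L$ and its gradients are $(\alpha,r)$-integrable.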



In order to provide a concrete sufficient condition which guarantees the coercivity of $\mathcal{L}$ over $\mathrm{K}$, we take advantage of the affine domination of the term $\|\dot{x}\|_{\mathrm{L}^r}$ on both the terms $\|x\|_{\mathrm{L}^r}$ and $\|{}_c\mathrm{D}^\alpha_{a+}[x]\|_{\mathrm{L}^r}$ for all $x \in \mathrm{K}$. We refer to [B05, Section VI.2.3] for a detailed discussion on that point. We get the following proposition.

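Proposition 1.30 (Sufficient condition for coercivity). If there exist $c_0 > 0$, $N \in \mathbb{N}^*$, $c_k \in \mathbb{R}$ and $d_{0,k}, d_{\alpha,k}, d_{1,k} \in \mathbb{R}_+$, satisfying $d_{0,k} + d_{\alpha,k} + d_{1,k} < r$ for all $k = 1, \ldots, N$, such that

\[
L(x, v_\alpha, v, t) \geq c_0 \|v\|^r_{\mathbb{R}^n} + \sum_{k=1}^N c_k \, \|x\|^{d_{0,k}}_{\mathbb{R}^n} \, \|v_\alpha\|^{d_{\alpha,k}}_{\mathbb{R}^n} \, \|v\|^{d_{1,k}}_{\mathbb{R}^n},
\]

for all $(x, v_\alpha, v, t) \in (\mathbb{R}^n)^3 \times [a,b]$, then $\mathcal{L}$ is coercive over $\mathrm{K}$.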

The proofs of Propositions 1.29 and 1.30 can be found in [B05, Sections VI.2.2 and VI.2.3]. In Section 1.3.4 below, examples of Lagrangian functions satisfying the hypotheses of Propositions 1.29 and 1.30 are provided.

1.3.3 First-order necessary optimality condition of Euler–Lagrange type

As mentioned in the Introduction, at the time of my PhD thesis, most articles in fractional calculus of variations focused (only) on first-order necessary optimality conditions of Euler–Lagrange type and the functional framework was not discussed in general. When it was, the authors essentially considered the functional space $\mathrm{C}^1$, which is suitable for deriving necessary optimality conditions (in order to benefit from the boundedness of the involved functions and to apply the Lebesgue dominated convergence theorem). On the other hand, it is well known that the functional space $\mathrm{C}^1$, due to its lack of reflexivity, is not appropriate for establishing existence results.

As we have seen in Theorem 1.23, the functional setting considered in this chapter makes it possible to derive existence results for minimizers of fractional Lagrange functionals. Nevertheless we must be sure that this setting is also suitable (as the functional space $\mathrm{C}^1$ is) for deriving first-order necessary optimality conditions of Euler–Lagrange type. The aim of this section is to provide a characterization of the critical points of $\mathcal{L}$, that is, of the functions $x \in \mathrm{K}$ such that $D\mathcal{L}(x)(w) = 0$ for all $w \in \mathrm{C}^\infty_c$, where $D\mathcal{L}(x)(w)$ stands for the Gâteaux differential of $\mathcal{L}$ at $x$ in the direction $w$. Assuming that the assumptions of Proposition 1.29 are satisfied (for integrable dominations), using the Lebesgue dominated convergence theorem and the fractional integration by parts formula recalled in Proposition 1.19, we prove in [B05, Theorem VI.2] the next theorem.

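Theorem 1.31. Assume that $L$ satisfies the assumptions of Proposition 1.29. Let $x \in \mathrm{K}$. Then $x$ is a critical point of $\mathcal{L}$ if and only if $x$ satisfies the fractional Euler–Lagrange equation given by

\[
\nabla_1 L(x(t), {}_c\mathrm{D}^\alpha_{a+}[x](t), \dot{x}(t), t) - \frac{d}{dt} \Big[ \mathrm{I}^{1-\alpha}_{b-}\big[ \nabla_2 L(x, {}_c\mathrm{D}^\alpha_{a+}[x], \dot{x}, \cdot) \big] + \nabla_3 L(x, {}_c\mathrm{D}^\alpha_{a+}[x], \dot{x}, \cdot) \Big](t) = 0_{\mathbb{R}^n},
\]

for almost every $t \in [a,b]$.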
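The structure of this equation can be understood from the following formal computation, given here as a sketch only (the rigorous argument, including the differentiation under the integral sign, is the one of [B05, Theorem VI.2]). The shorthand $\nabla_i L[x]$ for the gradient $\nabla_i L$ evaluated along $x$, its Caputo derivative and its classical derivative is introduced here for readability, and the identity ${}_c\mathrm{D}^\alpha_{a+}[w] = \mathrm{I}^{1-\alpha}_{a+}[\dot{w}]$ is used for $w \in \mathrm{C}^\infty_c$.

```latex
\begin{align*}
D\mathcal{L}(x)(w)
&= \int_a^b \langle \nabla_1 L[x], w \rangle_{\mathbb{R}^n}
 + \langle \nabla_2 L[x], \mathrm{I}^{1-\alpha}_{a+}[\dot{w}] \rangle_{\mathbb{R}^n}
 + \langle \nabla_3 L[x], \dot{w} \rangle_{\mathbb{R}^n} \, d\tau \\
&= \int_a^b \langle \nabla_1 L[x], w \rangle_{\mathbb{R}^n}
 + \big\langle \mathrm{I}^{1-\alpha}_{b-}[\nabla_2 L[x]] + \nabla_3 L[x], \dot{w} \big\rangle_{\mathbb{R}^n} \, d\tau ,
\end{align*}
```

where the second line follows from Proposition 1.19 with the order $1-\alpha$. Requiring this quantity to vanish for all $w \in \mathrm{C}^\infty_c$ means that $\mathrm{I}^{1-\alpha}_{b-}[\nabla_2 L[x]] + \nabla_3 L[x]$ admits $\nabla_1 L[x]$ as weak derivative, which is the equation of Theorem 1.31.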

We conclude our theoretical study with the following corollary (which directly follows from Theorems 1.23 and 1.31).

Corollary 1.32. Assume that $L$ satisfies the assumptions of Proposition 1.29, that $\mathcal{L}$ is coercive over $\mathrm{K}$ and that $L(\cdot, \cdot, \cdot, t)$ is convex over $(\mathbb{R}^n)^3$ for any $t \in [a,b]$. Then there exists a minimizer for $\mathcal{L}$ which is moreover a solution to the fractional Euler–Lagrange equation given in Theorem 1.31.

Remark 1.33. As in Remark 1.27, the convexity assumption in Corollary 1.32 is quite restrictive. Nevertheless this hypothesis can be relaxed by imposing stronger assumptions on the continuity of $L$ (in terms of uniform equicontinuity), and by invoking the Ascoli theorem. This remark allows us to handle examples with lack of convexity in the next section.



1.3.4 Some illustrative examples

In this section we present several illustrative examples in order to give the reader an insight into which class of Lagrangian functions can be handled by our theoretical study. We start with some quasi-polynomial convex Lagrangian functions (see Examples 1.34 and 1.35) and we conclude this section with two examples with lack of convexity (see Examples 1.36 and 1.37).

Firstly, as in the classical theory, the expression of the Lagrangian function $L$ fixes the coefficient $1 < r < +\infty$ in order to ensure the coercivity of $\mathcal{L}$ over $\mathrm{K}$ (using Proposition 1.30). Secondly, in the present fractional setting, we can adjust the coefficient $0 < \alpha < 1$ in order to ensure the $(\alpha,r)$-integrability of $L$ and its gradients (using Proposition 1.29). In practice we observe that the smaller $r$ is, the closer to zero $\alpha$ needs to be. Note that, from the second item in Remark 1.27, when the Lagrangian function $L$ is convex only in its third variable $v$, our assumptions directly impose the restriction $(1-\alpha)r > 1$ (see Example 1.37).

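Example 1.34. Consider the basic polynomial convex Lagrangian function:

\[
L(x, v_\alpha, v, t) = \frac{1}{2} \big( \|x\|^2_{\mathbb{R}^n} + \|v_\alpha\|^2_{\mathbb{R}^n} + \|v\|^2_{\mathbb{R}^n} \big),
\]

for all $(x, v_\alpha, v, t) \in (\mathbb{R}^n)^3 \times [a,b]$. In order to satisfy the assumptions of Proposition 1.30, we fix $r = 2$. Then one can see that the assumptions of Proposition 1.29 are satisfied for every $0 < \alpha < 1$. From Corollary 1.32, for every $0 < \alpha < 1$, we get that $\mathcal{L}$ admits a minimizer $x \in \mathrm{K}$ which moreover satisfies the fractional Euler–Lagrange equation

\[
x(t) - \frac{d}{dt} \Big[ \mathrm{I}^{1-\alpha}_{b-}\big[ {}_c\mathrm{D}^\alpha_{a+}[x] \big] + \dot{x} \Big](t) = 0_{\mathbb{R}^n},
\]

for almost every $t \in [a,b]$.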


Example 1.35. Consider the quasi-polynomial convex Lagrangian function:

\[
L(x, v_\alpha, v, t) = \|x\|^{d_0}_{\mathbb{R}^n} + \|v_\alpha\|^{d_\alpha}_{\mathbb{R}^n} + \|v\|^{r}_{\mathbb{R}^n},
\]

for all $(x, v_\alpha, v, t) \in (\mathbb{R}^n)^3 \times [a,b]$, with $d_0, d_\alpha > 1$ and $r > 1$. The coercivity of $\mathcal{L}$ is directly satisfied from Proposition 1.30. Then one can see that the assumptions of Proposition 1.29 are satisfied for every $0 < \alpha < 1$ satisfying $0 < \alpha \leq \frac{r + (r-1)d_\alpha}{r d_\alpha}$. We deduce that Corollary 1.32 can be applied for every $0 < \alpha < 1$ such that $0 < \alpha \leq \frac{r + (r-1)d_\alpha}{r d_\alpha}$. Moreover, since $\frac{1}{r'} \leq \frac{r + (r-1)d_\alpha}{r d_\alpha}$ for every $d_\alpha > 1$, Corollary 1.32 can be applied for every $0 < \alpha \leq \frac{1}{r'}$ independently of the value of $d_\alpha > 1$. Let us develop two particular cases:

• If $r = 2$, Corollary 1.32 can be applied for every $0 < \alpha < 1$ such that $0 < \alpha \leq \frac{2 + d_\alpha}{2 d_\alpha}$ and, in particular, for every $0 < \alpha \leq 1/2$ independently of the value of $d_\alpha > 1$. In the case $d_\alpha = 2$, Corollary 1.32 can be applied for every $0 < \alpha < 1$. In the case $d_\alpha = 4$, Corollary 1.32 can be applied for every $0 < \alpha \leq 3/4$.

• If $r = 4$, Corollary 1.32 can be applied for every $0 < \alpha < 1$ such that $0 < \alpha \leq \frac{4 + 3 d_\alpha}{4 d_\alpha}$ and, in particular, for every $0 < \alpha \leq 3/4$ independently of the value of $d_\alpha > 1$. In the case $d_\alpha = 4$, Corollary 1.32 can be applied for every $0 < \alpha < 1$. In the case $d_\alpha = 8$, Corollary 1.32 can be applied for every $0 < \alpha \leq 7/8$.
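The numerical values quoted in the two particular cases can be checked mechanically. The short script below is an illustrative sketch: the function name `alpha_upper_bound` is introduced here for convenience, and the script only evaluates the bound stated in Example 1.35 with exact rationals, without verifying the assumptions of Proposition 1.29 themselves.

```python
from fractions import Fraction

def alpha_upper_bound(r, d_alpha):
    """Bound on alpha from Example 1.35: (r + (r - 1) d_alpha) / (r d_alpha)."""
    r, d_alpha = Fraction(r), Fraction(d_alpha)
    if r <= 1 or d_alpha <= 1:
        raise ValueError("Example 1.35 assumes r > 1 and d_alpha > 1")
    return (r + (r - 1) * d_alpha) / (r * d_alpha)

for r, d_alpha in [(2, 2), (2, 4), (4, 4), (4, 8)]:
    bound = alpha_upper_bound(r, d_alpha)
    uniform_bound = 1 - Fraction(1, r)  # this is 1/r', valid for every d_alpha > 1
    # A bound >= 1 means that every 0 < alpha < 1 is admissible.
    print(r, d_alpha, bound, uniform_bound)
```

Writing the bound as $\frac{1}{d_\alpha} + \frac{1}{r'}$ makes it clear that it decreases towards $\frac{1}{r'}$ as $d_\alpha$ grows, which is why $0 < \alpha \leq \frac{1}{r'}$ is always admissible.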

Example 1.36. Let $n = 1$ and consider the Lagrangian function:

\[
L(x, v_\alpha, v, t) = \frac{1}{4} \big( \sin(x) + v_\alpha^4 + v^4 \big),
\]

for all $(x, v_\alpha, v, t) \in \mathbb{R}^3 \times [a,b]$, which is not convex in its first variable. Taking $r = 4$, the coercivity of $\mathcal{L}$ follows from Proposition 1.30. Then one can see that the assumptions of Proposition 1.29 are satisfied for every $0 < \alpha < 1$. Since $(L(\cdot, v_\alpha, v, t))_{(v_\alpha, v, t) \in \mathbb{R}^2 \times [a,b]}$ is clearly uniformly equicontinuous, we deduce from Remark 1.33 that the conclusions of Corollary 1.32 are valid for every $0 < \alpha < 1$.



Example 1.37. Let $n = 1$ and consider the Lagrangian function:

\[
L(x, v_\alpha, v, t) = \sin(v_\alpha) - \sqrt{1 + x^2} + \frac{1}{2} v^2,
\]

for all $(x, v_\alpha, v, t) \in \mathbb{R}^3 \times [a,b]$, which is convex in neither of its first two variables. Taking $r = 2$, the coercivity of $\mathcal{L}$ follows from Proposition 1.30. Then one can see that the assumptions of Proposition 1.29 are satisfied for every $0 < \alpha < 1$. Since $(L(\cdot, \cdot, v, t))_{(v,t) \in \mathbb{R} \times [a,b]}$ is uniformly equicontinuous and taking $0 < \alpha < 1/2$ (this is the assumption $(1-\alpha)r > 1$ in the second item of Remark 1.27), we deduce from Remark 1.33 that the conclusions of Corollary 1.32 are valid for every $0 < \alpha < 1/2$.

1.4 Related works and perspectives

In Section 1.3 we have developed in detail the study of one typical fractional calculus of variations problem in order to illustrate the main techniques employed in the works [B03, B04, B05, B06, B07] to obtain the existence of minimizers for fractional Lagrange functionals and thus of solutions to fractional Euler–Lagrange equations. The aim of this section is to provide a brief summary of the various frameworks studied in these works, emphasizing the specific difficulties encountered and the technical tools used in order to overcome them.

Removing the dependence in the classical derivative. Let $0 < \alpha < 1$ and $1 < r < +\infty$ be fixed. A first extension of the work presented in Section 1.3 is to remove the dependence of $L$ in the classical derivative. Precisely, in this paragraph, we consider the minimization of the fractional Lagrange functional given by

\[
\mathcal{L}(x) := \int_a^b L(x(\tau), {}_c\mathrm{D}^\alpha_{a+}[x](\tau), \tau) \, d\tau,
\]

which does not involve the classical derivative $\dot{x}$. In that situation one cannot expect to use Proposition 1.30 in order to get coercivity of $\mathcal{L}$ over $\mathrm{K}$. It turns out that the classical Sobolev space $\mathrm{W}^{1,r}$ is not suitable for developing the strategy presented in Section 1.3, and another reflexive Banach space has to be considered. To this aim we introduce the so-called Caputo fractional Sobolev space

\[
{}_c\mathrm{AC}^{\alpha,r}_{a+} := \{ x \in {}_c\mathrm{AC}^\alpha_{a+} \mid {}_c\mathrm{D}^\alpha_{a+}[x] \in \mathrm{L}^r \},
\]

endowed with the norm

\[
\|x\|_{{}_c\mathrm{AC}^{\alpha,r}_{a+}} := \|x\|_{\mathrm{L}^r} + \|{}_c\mathrm{D}^\alpha_{a+}[x]\|_{\mathrm{L}^r},
\]

for all $x \in {}_c\mathrm{AC}^{\alpha,r}_{a+}$. It can be proved that, if $0 < \frac{1}{r} < \alpha < 1$, then ${}_c\mathrm{AC}^{\alpha,r}_{a+}$ is a reflexive Banach space (see [B03, Section 3] for a similar statement). As a consequence, using the functional space ${}_c\mathrm{AC}^{\alpha,r}_{a+}$, a method similar to the one presented in Section 1.3 can be developed but (only) under the restriction $0 < \frac{1}{r} < \alpha < 1$.

An interesting phenomenon emerges from the conditions on $\alpha$ and $r$ in that context. Indeed, note that, as in Section 1.3, the larger $r$ is, the better. But, in contrast to Section 1.3 (where the closer $\alpha$ is to zero, the better), in the present situation the closer $\alpha$ is to $1$, the better.

Replacing the Caputo fractional derivative with a RL fractional derivative. Let $0 < \alpha < 1$ and $1 < r < +\infty$ be fixed. An extension of the work presented in the previous paragraph is to replace the Caputo fractional derivative ${}_c\mathrm{D}^\alpha_{a+}$ by a RL fractional derivative $\mathrm{D}^\alpha_{a+}$. Precisely, in this paragraph, we consider the minimization of the fractional Lagrange functional given by

\[
\mathcal{L}(x) := \int_a^b L(x(\tau), \mathrm{D}^\alpha_{a+}[x](\tau), \tau) \, d\tau .
\]



In that situation, similarly to the previous paragraph, one should introduce and use the so-called RL fractional Sobolev space $\mathrm{AC}^{\alpha,r}_{a+}$, which is a reflexive Banach space when $0 < \frac{1}{r} < \alpha < 1$. Following Proposition 1.7, it is also more appropriate to consider the constraint set

\[
\mathrm{K} := \{ x \in \mathrm{AC}^{\alpha,r}_{a+} \mid \mathrm{I}^{1-\alpha}_{a+}[x](a) = x_a \},
\]

where $x_a \in \mathbb{R}^n$ is fixed, in order to guarantee the affine domination of the term $\|\mathrm{D}^\alpha_{a+}[x]\|_{\mathrm{L}^r}$ on the term $\|x\|_{\mathrm{L}^r}$ for all $x \in \mathrm{K}$.

An emergent difficulty in the present setting concerns the integrability of the functions $x \in \mathrm{AC}^{\alpha,r}_{a+}$. Indeed, when dealing with Caputo fractional derivatives, we know that the functions $x \in {}_c\mathrm{AC}^{\alpha,r}_{a+}$ are continuous, and thus bounded. In contrast, when dealing with RL fractional derivatives, it is well known (see Remark 1.8) that the functions $x \in \mathrm{AC}^{\alpha,r}_{a+}$ admit a singularity at $t = a$ in general. As a consequence precautions should be taken. Precisely, the coefficients considered in the quasi-polynomial growths of the Lagrangian function have to be modified accordingly. I refer to [B07, Section 4.1] for details.

Considering a fractional Bolza functional. Let $0 < \alpha < 1$ and $1 < r < +\infty$ be fixed. At the time of my PhD thesis, most articles dealt with first-order necessary optimality conditions of Euler–Lagrange type for fractional Lagrange functionals. However no study had been written about the minimization of fractional functionals of Bolza form. In this paragraph we consider the fractional Bolza functional given by

\[
\mathcal{B}(x) := g(\mathrm{I}^{1-\alpha}_{a+}[x](a), x(b)) + \int_a^b L(x(\tau), \mathrm{D}^\alpha_{a+}[x](\tau), \tau) \, d\tau,
\]

which involves boundary terms in the additional Mayer cost $g(\mathrm{I}^{1-\alpha}_{a+}[x](a), x(b))$ (where the real function $g : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is assumed to be of class $\mathrm{C}^1$). Since boundary conditions are involved in the cost, we take $\mathrm{K} = \mathrm{AC}^{\alpha}_{a+}$, that is, the whole space.

A major difficulty emerges in that context when one aims to derive the corresponding necessary optimality conditions. Due to the technical point mentioned in Remark 1.14, it turns out that the fractional integration by parts formula recalled in Proposition 1.19, which is suitable for dealing with the Caputo fractional derivative ${}_c\mathrm{D}^\alpha_{a+} = \mathrm{I}^{1-\alpha}_{a+} \circ \frac{d}{dt}$, is not appropriate for dealing with the RL fractional derivative $\mathrm{D}^\alpha_{a+} = \frac{d}{dt} \circ \mathrm{I}^{1-\alpha}_{a+}$ and perturbations $w \in \mathrm{C}^\infty$ (we do not consider perturbations $w \in \mathrm{C}^\infty_c$ in the present Bolza context). In order to overcome this difficulty, we prove in [B07, Theorem 2] a new fractional integration by parts formula recalled in the next proposition.

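Proposition 1.38 (Fractional integration by parts formula). If $x_1 \in \mathrm{AC}^{\alpha,r_1}_{a+}$ and $x_2 \in \mathrm{AC}^{\alpha,r_2}_{b-}$ with $0 < \frac{1}{r_1} < \alpha < 1$ and $0 < \frac{1}{r_2} < \alpha < 1$, then it holds that

\[
\begin{aligned}
\int_a^b \langle \mathrm{D}^\alpha_{a+}[x_1](\tau), x_2(\tau) \rangle_{\mathbb{R}^n} \, d\tau
={}& \int_a^b \langle x_1(\tau), \mathrm{D}^\alpha_{b-}[x_2](\tau) \rangle_{\mathbb{R}^n} \, d\tau \\
& + \langle x_1(b), \mathrm{I}^{1-\alpha}_{b-}[x_2](b) \rangle_{\mathbb{R}^n} - \langle \mathrm{I}^{1-\alpha}_{a+}[x_1](a), x_2(a) \rangle_{\mathbb{R}^n} .
\end{aligned}
\]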

Hence, under the restriction $0 < \frac{1}{r} < \alpha < 1$ (again) and using the functional space $\mathrm{AC}^{\alpha}_{a+}$, the above fractional integration by parts formula has been successfully applied in the collaboration [B07] in order to derive not only the fractional Euler–Lagrange equation corresponding to the present context, but also (new) boundary conditions of the form

\[
\begin{pmatrix}
\nabla_2 L(x(a), \mathrm{D}^\alpha_{a+}[x](a), a) \\
- \mathrm{I}^{1-\alpha}_{b-}\big[ \nabla_2 L(x, \mathrm{D}^\alpha_{a+}[x], \cdot) \big](b)
\end{pmatrix}
= \nabla g(\mathrm{I}^{1-\alpha}_{a+}[x](a), x(b)),
\]

which are called transversality conditions.
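The way the boundary terms of Proposition 1.38 produce such conditions can be sketched formally as follows. This is a heuristic computation only: the shorthand $\nabla_2 L[x] := \nabla_2 L(x, \mathrm{D}^\alpha_{a+}[x], \cdot)$ (and similarly $\nabla_1 L[x]$) is introduced here, $\nabla_2 L[x]$ is assumed to be regular enough to play the role of $x_2$ in Proposition 1.38, and the precise class of perturbations $w$ is the one of [B07], which is not reproduced here.

```latex
\begin{align*}
D\mathcal{B}(x)(w)
={}& \langle \nabla_1 g, \mathrm{I}^{1-\alpha}_{a+}[w](a) \rangle_{\mathbb{R}^n}
   + \langle \nabla_2 g, w(b) \rangle_{\mathbb{R}^n}
   + \int_a^b \langle \nabla_1 L[x], w \rangle_{\mathbb{R}^n}
   + \langle \nabla_2 L[x], \mathrm{D}^\alpha_{a+}[w] \rangle_{\mathbb{R}^n} \, d\tau \\
={}& \int_a^b \big\langle \nabla_1 L[x] + \mathrm{D}^\alpha_{b-}[\nabla_2 L[x]], w \big\rangle_{\mathbb{R}^n} \, d\tau \\
   & + \big\langle \nabla_1 g - \nabla_2 L[x](a), \mathrm{I}^{1-\alpha}_{a+}[w](a) \big\rangle_{\mathbb{R}^n}
   + \big\langle \nabla_2 g + \mathrm{I}^{1-\alpha}_{b-}[\nabla_2 L[x]](b), w(b) \big\rangle_{\mathbb{R}^n} ,
\end{align*}
```

where $\nabla_1 g$ and $\nabla_2 g$ are evaluated at $(\mathrm{I}^{1-\alpha}_{a+}[x](a), x(b))$. Letting the integral term and the two boundary terms vanish separately gives the Euler–Lagrange equation together with $\nabla_1 g = \nabla_2 L[x](a)$ and $\nabla_2 g = -\mathrm{I}^{1-\alpha}_{b-}[\nabla_2 L[x]](b)$, which is the two-block form of the transversality conditions.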



Perspectives and open challenges. As we have seen in this chapter, the frameworks introduced and used in the works [B03, B04, B05, B06, B07] make it possible to establish sufficient conditions ensuring the existence of solutions to fractional calculus of variations problems. Nevertheless we observe a competition between the coefficients $0 < \alpha < 1$ and $1 < r < +\infty$. When the conditions on $\alpha$ and $r$ are not satisfied, the existence of solutions is not guaranteed and remains an open question. In particular, when using the fractional Sobolev spaces $\mathrm{AC}^{\alpha,r}_{a+}$ and ${}_c\mathrm{AC}^{\alpha,r}_{a+}$ or the fractional integration by parts formula recalled in Proposition 1.38, our method is restricted to the case $0 < \frac{1}{r} < \alpha < 1$. Therefore, a challenging issue, which deserves particular attention, concerns the pathological case $0 < \alpha \leq \frac{1}{r} < 1$.

The objective of the works [B03, B04, B05, B06, B07] was to provide sufficient conditions ensuring the existence of solutions to fractional Euler–Lagrange equations. The method was based on the classical approach in infinite dimensional optimization theory evoked in the Introduction. In particular, for the purpose of weak lower semicontinuity, I was led to consider (quite restrictive) convexity assumptions. Nevertheless there exist many standard methods for proving the existence of solutions to classical variational problems. The extensions of these methods to the fractional case constitute as many mathematical challenges. For example, I mention that another strategy has been explored in the collaboration [B07] in order to obtain the existence of a solution to a fractional boundary value problem. Based on the fractional integration by parts formula recalled in Proposition 1.38, we were able to give a Hilbert structure to the functional space $\mathrm{AC}^{\alpha,2}_{a+}$, when $1/2 < \alpha < 1$, and to prove an existence result thanks to the Stampacchia theorem. I refer to [B07, Section 5] for more details.

A longer-term perspective for the community lies in the asymmetric fractional framework introduced by Cresson and Inizan [29, 30] and evoked in the Introduction. However this setting is based on splitting the variable of the fractional Lagrange functional in two. As a consequence the classical variational approaches cannot be applied and new strategies have to be designed in order to deal with nonconservative systems with the help of fractional operators. A recent work [55] by Jiménez and Ober-Blöbaum goes in that direction.


Chapter 2

Pontryagin maximum principle for Caputo fractional optimal control problems


2.2 Preliminaries on RL and Caputo fractional Cauchy problems

2.2.1 Framework and basic assumptions

2.2.2 Results on the Caputo fractional Cauchy problem (cCP)

2.2.3 Results on the RL fractional Cauchy problem (CP)

2.2.4 Some extensions in [B08]

2.3 Main results

2.3.1 Caputo fractional optimal control problem: terminology and assumptions

2.3.2 Preliminaries: sensitivity analysis of Caputo fractional Cauchy problems

2.3.3 Filippov existence theorem and Pontryagin maximum principle

2.3.4 Perspectives

2.4 Applications to fractional calculus of variations

2.4.1 Legendre condition for a Caputo fractional calculus of variations problem

2.4.2 Obstructions for the usual argument in the fractional setting

2.4.3 Bonus: discussion on some non-existence results

The present chapter summarizes the contributions of the four following references:

• [B08]: L. Bourdin. Cauchy–Lipschitz theory for fractional multi-order dynamics: state-transition matrices, Duhamel formulas and duality theorems. Differential Integral Equations, 31(7-8):559–594, 2018.

• [B09]: L. Bourdin. Weighted Hölder continuity of Riemann–Liouville fractional integrals - Application to regularity of solutions to fractional Cauchy problems with Carathéodory dynamics. Fract. Calc. Appl. Anal., 22(3):722–749, 2019.

• [B10]: M. Bergounioux and L. Bourdin. Pontryagin maximum principle for general Caputo fractional optimal control problems with Bolza cost and terminal constraints. ESAIM Control Optim. Calc. Var. (to appear), 2019.

• [B11]: L. Bourdin and R. Ferreira. First and second-order necessary optimality conditions for Bolza functionals with Caputo fractional derivatives and general mixed initial/final constraints. Submitted, 2019.


This chapter keeps the notation of fractional calculus introduced in Chapter 1.

2.1 Introduction

Optimal control theory. Optimal control theory is the mathematical field concerned with the analysis of controlled dynamical systems, where one aims at steering such a system from a given configuration to some desired target, under some given constraints, and by minimizing a given criterion. Most of the literature focuses on dynamical systems driven by ordinary differential equations. In that framework, the Filippov theorem, established in 1959, ensures the existence of at least one optimal trajectory under some appropriate compactness/convexity hypotheses (see [38]). On the other hand, the Pontryagin Maximum Principle (in short, PMP), established at the end of the fifties (see [20], and see [40] for the history of this discovery), is the milestone of optimal control theory. It provides first-order necessary optimality conditions and reduces the search for optimal trajectories to a boundary value problem. Roughly speaking, the PMP ensures the existence of an adjoint vector (also called costate vector) which satisfies some terminal conditions (called transversality conditions) and such that the optimal control maximizes the Hamiltonian associated with the optimal control problem.

Optimal control theory, and in particular the PMP, has a wide field of applications in various domains. I refer the reader to textbooks such as [1, 22, 23, 24, 48, 56, 61, 86, 87, 89, 92] for theoretical results and/or practical applications, essentially for dynamical systems described by ordinary differential equations. From the point of view of calculus of variations, the PMP corresponds to an extension of the Euler–Lagrange equation. Actually, for unconstrained optimal control problems, a weak version of the PMP (in which the Hamiltonian maximization condition is replaced by a weaker null Hamiltonian gradient condition) can be derived from a simple calculus of variations approach (see, e.g., [64, Section 3.4]). However, obtaining the strong version of the PMP, which moreover handles constraints on the terminal state values and/or on the control values, requires more sophisticated mathematical tools such as the sensitivity analysis of the state equation under appropriate control perturbations (needle-like variations for instance) combined with a Brouwer fixed point argument (see, e.g., [48, 61]) or with the Ekeland variational principle (see, e.g., [36, 63]). Many other variants exist in the literature (based on an implicit function theorem [1], the Hahn–Banach separation theorem [22], or the Aubin mini-max theorem [92] for example).

Fractional optimal control theory and technical difficulties. Compared to the vast literature on fractional calculus of variations (see Introduction of Chapter 1), fractional optimal control theory (in which the dynamical system is driven by a fractional differential equation) developed only slowly at the beginning of the 21st century. We refer the reader to [3, 4, 39, 44, 53] and references therein for some initiating works. These articles constitute a first step in the field, and essentially use calculus of variations approaches in order to establish weak versions of the PMP for unconstrained fractional optimal control problems. In my PhD thesis [B05, Chapter VII], a similar method is proposed for a general nonlinear Caputo fractional optimal control problem. However all these results are not satisfactory for several reasons. Firstly, no constraint can be handled, neither on the terminal state values nor on the control values. Secondly, the Hamiltonian maximization condition is not obtained, and is replaced by the weaker null Hamiltonian gradient condition. This second feature is due to the calculus of variations approaches used in the above works, where the authors consider global $\mathrm{L}^\infty$-perturbations of the control while, as is well known in the classical optimal control theory, local $\mathrm{L}^1$-perturbations of the control (such as needle-like variations) should be considered in order to derive the Hamiltonian maximization condition. But there is an underlying reason: the extension of local control perturbations to the fractional setting induces technical difficulties due to the nonlocality of the fractional operators and, therefore, the derivation of variation vectors associated with needle-like variations of the control is not a simple task. Moreover, due to the lack of a (simple) fractional version of the Leibniz formula, the definition of a suitable adjoint vector (associated with the variation vectors) is not trivial either. Actually, the derivation of a strong version of the PMP for fractional optimal control problems, which can moreover handle constraints on the terminal state values and on the control values, appeared in 2013 as a priority perspective in the conclusion of my PhD thesis (see [B05, Conclusion]).

As a second step in fractional optimal control theory, I mention the works of Kamocki since 2014. Indeed, a first attempt to establish a strong version of the PMP (with Hamiltonian maximization condition) in the case of a general RL fractional optimal control problem with a classical Lagrange cost and with control constraints can be found in [57, Theorem 7]. However, several (quite restrictive) hypotheses are assumed, such as the compactness of the control constraint set, the convexity of the set of augmented velocities, the global Lipschitz continuity of the dynamics and some growth conditions on the dynamics, the Lagrange cost function and their gradients. Moreover a RL fractional version of the initial condition is fixed, and no other terminal state constraint can be handled. Hence, many challenging questions remained open in that field.

Major contributions. Section 2.3 of the present chapter is extracted from my joint work [B10] with Bergounioux from the University of Orléans (France). In this paper we have considered a general control system driven by a nonlinear Caputo fractional differential equation of the form

\[
{}_c\mathrm{D}^\alpha_{a+}[x](t) = f(x(t), u(t), t),
\]

on a fixed real interval $[a,b]$ with $a < b$ and of fractional order $0 < \alpha \leq 1$. In accordance with the discussion provided in the previous paragraph, our motivation was to consider a sufficiently general Caputo fractional optimal control problem in order to handle:

• General control constraint $u(t) \in \mathrm{U}$ where $\mathrm{U}$ is a nonempty closed set. I refer to Remark 2.31 in Section 2.3.3 for a discussion on this closedness assumption.

• General mixed initial/final state constraint $\psi(x(a), x(b)) \in \mathrm{S}$ where $\mathrm{S}$ is a nonempty closed convex set. To the best of my knowledge, no endpoint constraint had ever been considered in the literature on fractional optimal control theory before [B10]. Moreover note that the above terminal state constraint is quite general and encompasses many typical situations such as fixed initial and/or final conditions, free initial and/or final conditions, equality and/or inequality constraints, etc. We refer to Remark 2.36 in Section 2.3.3 for more details.

• Mayer costs and (classical or fractional) Lagrange costs. Therefore we consider a general Bolza cost of the form $g(x(a), x(b)) + \mathrm{I}^\beta_{a+}[L(x, u, \cdot)](b)$ where $\beta \geq \alpha$ (typically $\beta = 1$ or $\beta = \alpha$).
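To make the controlled Caputo dynamics above more tangible, the following sketch simulates a scalar instance with a discontinuous control. It is not taken from [B10]: it relies on the classical equivalence between the Caputo Cauchy problem and the integral equation $x(t) = x_a + \mathrm{I}^\alpha_{a+}[f(x, u, \cdot)](t)$, discretized with a standard explicit product rectangle rule. The function name `caputo_euler`, the dynamics and the control chosen in the usage lines are hypothetical illustrations.

```python
import math

def caputo_euler(f, u, x_a, a, b, alpha, n_steps):
    """Explicit rectangle-rule scheme for cD^alpha_{a+}[x](t) = f(x(t), u(t), t), x(a) = x_a.

    Scalar state for simplicity. Returns the time grid and the approximate states.
    """
    h = (b - a) / n_steps
    t = [a + k * h for k in range(n_steps + 1)]
    x = [x_a]
    rhs = []  # stored values f(x_j, u(t_j), t_j): the whole past is needed (nonlocality)
    scale = h ** alpha / math.gamma(alpha + 1)
    for n in range(n_steps):
        rhs.append(f(x[n], u(t[n]), t[n]))
        # weights (n + 1 - j)^alpha - (n - j)^alpha of the product rectangle rule
        memory = sum(((n + 1 - j) ** alpha - (n - j) ** alpha) * rhs[j]
                     for j in range(n + 1))
        x.append(x_a + scale * memory)
    return t, x

# Usage: a bang-bang (hence discontinuous) control with hypothetical dynamics f(x, u, t) = -x + u.
grid, states = caputo_euler(lambda x, u, t: -x + u,
                            lambda t: 1.0 if t < 0.5 else -1.0,
                            x_a=0.0, a=0.0, b=1.0, alpha=0.5, n_steps=200)
print(states[-1])
```

For $\alpha = 1$ all the weights equal $1$ and the scheme reduces to the classical explicit Euler method. For $0 < \alpha < 1$ every step reuses the whole history of the right-hand side, which is a discrete counterpart of the nonlocality of the fractional operators.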

Moreover the regularity assumptions that we required in [B10] on the maps $f$, $L$, $g$ and $\psi$ were reduced as much as possible (as far as I know) to guarantee the correctness of our proofs. In particular, no growth condition, no convexity and no global Lipschitz continuity were imposed. For the precise definition of the problem investigated and the corresponding assumptions, I refer the reader to Problem (FP) in Section 2.3.1.

The major contribution of the collaboration [B10] is a strong version of the PMP for Problem (FP) (see [B10, Theorem 3.2]). It is recalled in Theorem 2.30 in Section 2.3.3. Its proof is based on:

• The sensitivity analysis of Caputo fractional Cauchy problems with respect to needle-like variations of the control and to perturbations of the initial condition. The corresponding results, involving specific variation vectors, are recalled in Section 2.3.2.

• The consideration of a penalized functional in order to take into account the terminal state constraint $\psi(x(a), x(b)) \in \mathrm{S}$ and the application of the Ekeland variational principle [36].

With Bergounioux, we were eager to provide a consistent paper by proving moreover that the existence of optimal controls in some standard settings is preserved at the fractional level. Thus we provided in [B10, Theorem 3.1] (recalled in Theorem 2.28 in Section 2.3.3) a Filippov-type existence result which, as in the classical case, guarantees the existence of an optimal trajectory for Problem (FP) under some standard compactness/convexity assumptions.



Applications of the PMP to fractional calculus of variations. In 2018, Ferreira from the Univer-sity of Lisboa (Portugal) brought to my attention that serious flaws within the proof of second-order Legendre necessary optimality conditions in fractional calculus of variations were dissemi-nated in the literature. Therefore we were eager to elaborate together a correct proof, but it turnsout that the method used in the classical theory cannot be extended to the fractional case. Sec-tion 2.4 of the present chapter is dedicated to the collaboration [B11] with Ferreira in regard to thistopic:

(i) Firstly, during our research, we were able to establish a proposition which rules out for good the exact adaptation of the usual argument used for the classical Legendre condition to the fractional context with final state constraint. I refer to Proposition 2.41 and to Section 2.4.2 for a discussion on that technical point.

(ii) Secondly, it turns out that the PMP obtained in the work [B10] with Bergounioux (recalled in Theorem 2.30 in Section 2.3) constitutes a complete and correct argument for the Legendre condition in Caputo fractional calculus of variations problems. I refer to Theorem 2.39 in Section 2.4.1 for details.

(iii) Unexpectedly, the transversality conditions derived in Theorem 2.39 in Section 2.4.1 allow us to provide some non-existence results for Caputo fractional calculus of variations problems, which contributes to the discussion initiated in my first works as a PhD student (presented in Chapter 1) about the existence (or not) of minimizers to fractional functionals. I refer to Section 2.4.3 for the details.

Required preliminaries in fractional Cauchy–Lipschitz theory. As expected, the proof of the strong PMP for Problem (FP) obtained in [B10, Theorem 3.2] required a lot of technical adjustments from the classical case. Keeping such a PMP as an objective in mind since my PhD thesis, the preliminary paper [B08] was motivated by the need to complete the existing literature on fractional Cauchy–Lipschitz theory (also known as Picard–Lindelöf theory). As explained in detail in [B08, Section 1.1], a vast literature was already dedicated to this topic, but (only) for continuous dynamics, which is not an appropriate setting in order to deal with fractional control systems in which the control may be discontinuous. Furthermore, some well-known results from the classical theory had not yet been extended to the fractional setting. For example, the behavior of a nonglobal maximal solution to a Caputo fractional Cauchy problem had not been studied in the literature, while its unboundedness (see Theorem 2.6) plays a crucial role in order to guarantee that the set of admissible controls for globality of a Caputo fractional control system is open (in a certain sense, see Proposition 2.21 in Section 2.3.2 for details). Moreover the nonlocality of the fractional operators induces specific variation vectors under needle-like variations of the control and perturbations of the initial condition (see Propositions 2.22 and 2.25 in Section 2.3.2). For example, when considering needle-like variations of the control in a Caputo fractional control system, the emergent variation vectors are solutions to linear RL (and not Caputo) fractional Cauchy problems. As a consequence they are unbounded in general (this pitfall directly follows from the singularity discussed in Remark 1.8) and estimates of the involved singularities have to be obtained. To this aim, weighted continuity results have been derived in the second preliminary paper [B09]. I refer to Remark 2.11 for some details on that paper. Then, since there is no (simple) fractional analogue to the Leibniz formula, the introduction of an adjoint vector (associated with the above-mentioned variation vectors) is not trivial. Keeping in mind this difficulty, I extended in [B08, Section 4.1] the notion of state-transition matrix to the fractional setting and obtained new fractional Duhamel formulas (one is recalled in Theorem 2.13) which allow one to give a linear expression of the variation vectors. Finally the adjoint vector (associated with these variation vectors) can be introduced thanks to the extension to the fractional setting of the classical duality theorem on state-transition matrices derived in [B08, Section 4.2] (and recalled in Theorem 2.16 in Section 2.2.3). All results mentioned in this paragraph can be found in the next preliminary Section 2.2.



2.2 Preliminaries on RL and Caputo fractional Cauchy problems

This section is extracted from my two papers [B08, B09] which were originally motivated by the need to complete the existing literature on fractional differential equations in view of investigating fractional control systems. The first work [B08] was dedicated to the study of RL and Caputo fractional Cauchy problems with Carathéodory dynamics. In there, most of the well-known results from the classical Cauchy–Lipschitz theory (also known as Picard–Lindelöf theory) are extended to the fractional case. As developed in detail in [B08, Section 1.1], a wide literature was already dedicated to this topic, but (only) for continuous dynamics, which is not an appropriate setting in order to deal with fractional control systems in which the control may be discontinuous. Hence my major aim in [B08] was to contribute to the development of that field by considering, in particular, general Carathéodory dynamics. The second work [B09] was dedicated to the regularity of solutions to fractional Cauchy problems. Indeed, when dealing with RL fractional Cauchy problems, it is well known that the solutions are not bounded in general (this pitfall directly follows from the singularity discussed in Remark 1.8). However I was able in the article [B09] to prove that these solutions are at least weighted continuous (see Theorem 2.9 for details). This property makes it possible to estimate the emergent singularities at t = a. These estimations are then instrumental in order to carry out the sensitivity analysis of fractional control systems (in view of establishing a fractional version of the Pontryagin maximum principle in Section 2.3).

The aim of this section is not to provide a complete overview of the results obtained in the two works [B08, B09]. I will only recall the most essential results for the needs of Section 2.3. Nonetheless the last subsection (Section 2.2.4) is dedicated to a brief summary of the extensions obtained in [B08] and to general comments.

2.2.1 Framework and basic assumptions

In the whole Section 2.2 we are interested in the two nonlinear fractional Cauchy problems given by

\[
\begin{cases}
D^{\alpha}_{a+}[x](t) = F(x(t),t), \\
I^{1-\alpha}_{a+}[x](a) = x_a,
\end{cases}
\tag{CP}
\]

and

\[
\begin{cases}
{}_{c}D^{\alpha}_{a+}[x](t) = F(x(t),t), \\
x(a) = x_a,
\end{cases}
\tag{cCP}
\]

considered on a compact interval [a,b], with a < b, with a fractional order 0 < α ≤ 1, where $x_a \in \mathbb{R}^n$ is fixed and where the dynamics $F : \mathbb{R}^n \times [a,b] \to \mathbb{R}^n$ is a general Carathéodory function in the sense that F is continuous with respect to its first variable, and is (only) measurable in its second variable. Essentially (and as in the classical theory), the investigations of the above RL fractional Cauchy problem (CP) are based (from Proposition 1.7) on the integral formulation

\[
x(t) = \frac{1}{\Gamma(\alpha)} (t-a)^{\alpha-1} x_a + I^{\alpha}_{a+}[F(x,\cdot)](t),
\]

and the investigations of the above Caputo fractional Cauchy problem (cCP) are based (from Proposition 1.11) on the integral formulation

\[
x(t) = x_a + I^{\alpha}_{a+}[F(x,\cdot)](t).
\]
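The Caputo integral formulation above also lends itself to a simple numerical approximation. The following Python sketch is only an illustration and is not taken from [B08]: it assumes a scalar state, a uniform grid, and an explicit product-rectangle quadrature of the Riemann–Liouville integral (F is frozen at the left endpoint of each subinterval while the kernel is integrated exactly). The function name `caputo_euler` and its parameters are hypothetical.

```python
import math

def caputo_euler(F, x_a, a, b, alpha, n_steps):
    """Approximate x(t) = x_a + I^alpha_{a+}[F(x, .)](t) on a uniform grid.

    Illustrative explicit product-rectangle scheme (scalar state assumed):
    x_n = x_a + h^alpha / Gamma(alpha + 1)
              * sum_{j<n} ((n - j)^alpha - (n - j - 1)^alpha) * F(x_j, t_j).
    """
    h = (b - a) / n_steps
    ts = [a + k * h for k in range(n_steps + 1)]
    xs = [x_a]
    fs = []  # stored values F(x_j, t_j), reused because the operator is nonlocal
    coef = h ** alpha / math.gamma(alpha + 1.0)
    for n in range(1, n_steps + 1):
        fs.append(F(xs[-1], ts[n - 1]))
        # each weight is the exact integral of (t_n - s)^(alpha - 1) over [t_j, t_{j+1}]
        acc = sum(((n - j) ** alpha - (n - j - 1) ** alpha) * fs[j] for j in range(n))
        xs.append(x_a + coef * acc)
    return ts, xs
```

For α = 1 all weights equal 1 and the scheme reduces to the classical explicit Euler method. For 0 < α < 1 the whole history of F(x_j, t_j) enters each step, which reflects the nonlocality of the fractional operators and makes the cost quadratic in the number of steps.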

In order to state our main results in the next subsections we first need to introduce some basicassumptions on F in the following series of definitions.

Definition 2.1. The dynamics F is said to be preserving the integrability of zero if

\[
F(0_{\mathbb{R}^n}, \cdot) \in L^1([a,b],\mathbb{R}^n).
\tag{Hyp1}
\]

In what follows this property will be referred to as (Hyp1).



Definition 2.2. The dynamics F is said to be bounded on compacts if, for any compact subset $K \subset \mathbb{R}^n$, there exists M ≥ 0 such that

\[
\| F(x,t) \|_{\mathbb{R}^n} \leq M,
\tag{Hyp∞}
\]

for any x ∈ K and for almost every t ∈ [a,b]. In what follows this property will be referred to as (Hyp∞).

Definition 2.3. The dynamics F is said to be locally Lipschitz continuous in its first variable if, for every $(x,t) \in \mathbb{R}^n \times [a,b]$, there exist R > 0, δ > 0 and L ≥ 0 such that

\[
\| F(x_2,\tau) - F(x_1,\tau) \|_{\mathbb{R}^n} \leq L \| x_2 - x_1 \|_{\mathbb{R}^n},
\tag{Hyploc}
\]

for any $x_1, x_2 \in B_{\mathbb{R}^n}(x,R)$ and for almost every τ ∈ [t − δ, t + δ] ∩ [a,b]. In what follows this property will be referred to as (Hyploc).

Definition 2.4. The dynamics F is said to be globally Lipschitz continuous in its first variable if there exists L ≥ 0 such that

\[
\| F(x_2,\tau) - F(x_1,\tau) \|_{\mathbb{R}^n} \leq L \| x_2 - x_1 \|_{\mathbb{R}^n},
\tag{Hypglob}
\]

for any $x_1, x_2 \in \mathbb{R}^n$ and for almost every τ ∈ [a,b]. In what follows this property will be referred to as (Hypglob).

2.2.2 Results on the Caputo fractional Cauchy problem (cCP)

We first introduce the following notion of local solution to the Caputo fractional Cauchy problem (cCP). As in the classical Cauchy–Lipschitz theory, the notions of extension, maximal and global solutions follow.

Definition 2.5. A couple (x, I) is said to be a local solution to (cCP) if:

• I is an interval such that $\{a\} \subsetneq I \subset [a,b]$;

• $x : I \to \mathbb{R}^n$ is a function such that $x \in {}_{c}AC^{\alpha}_{a+}([a,c],\mathbb{R}^n)$, with $x(a) = x_a$ and ${}_{c}D^{\alpha}_{a+}[x](t) = F(x(t),t)$ for almost every t ∈ [a,c], for all $c \in I \setminus \{a\}$.

Let (x, I), (x′, I′) be two local solutions to (cCP). Then (x′, I′) is said to be an extension (resp. strict extension) of (x, I) if $I \subset I'$ (resp. $I \subsetneq I'$) and x′ = x on I. We say that (x, I) is a maximal solution to (cCP) if it does not admit any strict extension. Finally we say that (x, I) is a global solution to (cCP) if I = [a,b].

Theorem 2.6. If F satisfies (Hyp∞) and (Hyploc), then (cCP) has a unique maximal solution (x, I), which is the maximal extension of any other local solution. Furthermore we have the following alternative:

• either I = [a,b], that is, (x, I) is global;

• or I = [a,c), with c ∈ (a,b], and $x : I \to \mathbb{R}^n$ is unbounded on I.

Finally, if moreover F satisfies (Hypglob), then (x, I) is global, that is, I = [a,b].
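To make the alternative concrete, here is a short worked example in the classical case α = 1, which is covered by the theorem. The data (a = 0, b = 2, x_a = 1, F(x,t) = x²) are chosen purely for illustration and do not come from [B08].

```latex
% Illustrative data (assumption): n = 1, a = 0, b = 2, x_a = 1, F(x,t) = x^2, \alpha = 1.
% F satisfies (Hyp∞) and (Hyploc), but not (Hypglob).
\[
  \dot{x}(t) = x(t)^2, \quad x(0) = 1
  \qquad \Longrightarrow \qquad
  x(t) = \frac{1}{1-t}, \quad t \in I = [0,1).
\]
% The maximal solution is not global, since I \neq [0,2], and it is unbounded on I:
\[
  \lim_{t \to 1^-} x(t) = +\infty .
\]
```

This is the second branch of the alternative with c = 1. Replacing F by a globally Lipschitz dynamics such as F(x,t) = x rules it out, in agreement with the last statement of the theorem.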

Remark 2.7. The three results stated in Theorem 2.6 can be found in [B08, Section 3.2.2]:

(i) The proof of existence of a local solution is based on the integral formulation mentioned in Section 2.2.1 and on the contraction mapping theorem used on the complete metric set $C([a,a+\varepsilon], B_{\mathbb{R}^n}(x_a,R))$ (where R > 0 is given in Definition 2.3 and with a sufficiently small ε > 0) endowed with the standard uniform distance. The extension to a maximal solution is based on Zorn's lemma. Finally the uniqueness is established thanks to a fractional version of the Gronwall lemma that can be found in [33, 94].



(ii) To the best of my knowledge, the alternative (which is well known in the classical theory) was never discussed in the fractional setting (even for continuous dynamics) before [B08]. This result plays a fundamental role in the sensitivity analysis of fractional control systems in order to prove, for example, that the set of controls which are admissible for globality is open (in a certain sense, see Proposition 2.21 in Section 2.3 for details).

(iii) In the case where F satisfies (Hypglob), the globality of the maximal solution is proved by invoking the contraction mapping theorem, combined with the use of an equivalent and appropriate Bielecki norm on the space $C([a,b],\mathbb{R}^n)$. This method is largely inspired by [52].

2.2.3 Results on the RL fractional Cauchy problem (CP)

Due to the singularities at t = a when dealing with left RL fractional derivatives (see Remark 1.8), the use of local estimations is compromised in view of invoking the contraction mapping argument for the existence of a local solution to the RL fractional Cauchy problem (CP). Consequently the consideration of a dynamics F which is (only) locally Lipschitz continuous in its first variable remains an open challenge in the literature. Therefore, in this section, we (only) deal with a dynamics F which is globally Lipschitz continuous in its first variable, and consequently (only) with global solutions.

Definition 2.8. A function $x : [a,b] \to \mathbb{R}^n$ is said to be a (global) solution to (CP) if $x \in AC^{\alpha}_{a+}([a,b],\mathbb{R}^n)$ with $I^{1-\alpha}_{a+}[x](a) = x_a$ and $D^{\alpha}_{a+}[x](t) = F(x(t),t)$ for almost every t ∈ [a,b].

Theorem 2.9. If F satisfies (Hyp1) and (Hypglob), then (CP) has a unique (global) solution x. If moreover F satisfies (Hyp∞), then $\rho_\alpha x \in C([a,b],\mathbb{R}^n)$ with $\rho_\alpha(a) x(a) = x_a$, where $\rho_\alpha$ is the weight function defined by $\rho_\alpha(t) := \Gamma(\alpha)(t-a)^{1-\alpha}$ for all t ∈ [a,b].
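The role of the weight can be checked by hand on a scalar linear example. The choice F(x,t) = λx below is an illustrative assumption (it satisfies (Hyp1), (Hyp∞) and (Hypglob)); the closed form in terms of the two-parameter Mittag-Leffler function is the standard one for this linear RL problem and is not quoted from [B08, B09].

```latex
% Illustrative scalar case (assumption): n = 1, F(x,t) = \lambda x with \lambda \in \mathbb{R}.
% E_{\alpha,\alpha} denotes the two-parameter Mittag-Leffler function.
\[
  x(t) = x_a \, (t-a)^{\alpha-1} E_{\alpha,\alpha}\big(\lambda (t-a)^{\alpha}\big),
  \qquad
  E_{\alpha,\alpha}(z) = \sum_{k \geq 0} \frac{z^k}{\Gamma(\alpha k + \alpha)} .
\]
% For 0 < \alpha < 1 and x_a \neq 0, x is unbounded near t = a, whereas the weighted function is continuous:
\[
  \Gamma(\alpha)\,(t-a)^{1-\alpha}\, x(t)
  = x_a \, \Gamma(\alpha)\, E_{\alpha,\alpha}\big(\lambda (t-a)^{\alpha}\big)
  \xrightarrow[t \to a^+]{} x_a ,
  \qquad \text{since } E_{\alpha,\alpha}(0) = \frac{1}{\Gamma(\alpha)} .
\]
```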

Remark 2.10. The first result stated in Theorem 2.9 can be found in [B08, Section 3.1.2]. The existence/uniqueness of a global solution is based on the integral formulation mentioned in Section 2.2.1 and on the contraction mapping theorem applied on the functional space $L^1([a,b],\mathbb{R}^n)$ equipped with an equivalent and appropriate Bielecki norm. This method is inspired by [52].

Remark 2.11. Unfortunately the (global) solutions obtained with the method presented in Remark 2.10 belong to the set $AC^{\alpha}_{a+}([a,b],\mathbb{R}^n) \subset L^1([a,b],\mathbb{R}^n)$ and are unbounded in general (in particular at t = a). If no estimation is known on the emergent singularities, technical obstructions are raised in the sensitivity analysis of fractional control systems in Section 2.3. In the second preliminary paper [B09], my aim was to prove that these solutions are at least weighted continuous, which makes it possible to provide some estimates of the involved singularities. The second result stated in Theorem 2.9 can be found in [B09, Theorem 5.3]. Its proof is based on several new estimations of the form $I^{\alpha}_{a+}[L^{r}_{\beta}] \hookrightarrow H^{\eta}_{\gamma}$ established in the paper [B09, Section 4], where $L^{r}_{\beta}$ (resp. $H^{\eta}_{\gamma}$) stands for a weighted Lebesgue space (resp. weighted Hölder continuous space) and with various conditions on the parameters α, β, r, η and γ. These estimations can be seen as extensions of the estimations recalled in Proposition 1.18 to weighted functional spaces.

When dealing with the sensitivity analysis of nonlinear Caputo fractional Cauchy problems, it turns out that the emergent variation vectors are solutions to linear RL (and not Caputo) fractional Cauchy problems (see Proposition 2.22 in Section 2.3.2 for details). Thus the next results of this section are dedicated to the special linear case where the dynamics F is expressed as F(x,t) = A(t)x, where $A \in L^{\infty}([a,b],\mathbb{R}^{n \times n})$ is an essentially bounded matrix function. In that context our objective is to extend the classical Duhamel formula to the fractional setting. To this aim, in the following definition, we first extend the notion of state-transition matrix to the fractional case.

Definition 2.12 (Left RL fractional state-transition matrix). For every s ∈ [a,b), the linear square matrix RL fractional Cauchy problem given by

\[
\begin{cases}
D^{\alpha}_{s+}[\Phi](t) = A(t)\Phi(t), \\
I^{1-\alpha}_{s+}[\Phi](s) = \mathrm{Id}_n,
\end{cases}
\]

considered on the compact interval [s,b], admits, from Theorem 2.9, a unique (global) solution denoted by $\Phi(\cdot,s) \in AC^{\alpha}_{s+}([s,b],\mathbb{R}^{n \times n})$. The function Φ(·, ·) (with two variables) is called the left RL fractional state-transition matrix associated to A.

Theorem 2.13 (Fractional Duhamel formula). From Theorem 2.9, the linear RL fractional Cauchy problem given by

\[
\begin{cases}
D^{\alpha}_{a+}[x](t) = A(t)x(t), \\
I^{1-\alpha}_{a+}[x](a) = x_a,
\end{cases}
\tag{2.1}
\]

considered on the compact interval [a,b], has a unique (global) solution $x \in AC^{\alpha}_{a+}([a,b],\mathbb{R}^n)$. Moreover it satisfies the fractional Duhamel formula $x(t) = \Phi(t,a)x_a$ for all t ∈ (a,b], where Φ stands for the left RL fractional state-transition matrix associated to A.

Remark 2.14. Theorem 2.13 can be found in [B08, Theorem 5]. The proof of its second part is essentially based on technical estimations and on the classical Fubini theorem.

Remark 2.15. Obviously Theorems 2.6, 2.9 and 2.13 each have a right counterpart, that is, similar results can be derived when dealing with right (and not left) RL and Caputo fractional Cauchy problems.

When dealing with the sensitivity analysis of nonlinear Caputo fractional Cauchy problems, since the variation vectors are solutions to linear RL fractional Cauchy problems, we are eager to define a corresponding adjoint vector. To this aim we recall the next duality theorem (which is proved in [B08, Theorem 7] invoking technical estimations and the classical Fubini theorem) and we state the corollary which follows. In particular one should notice that these results involve right (and not left) RL fractional Cauchy problems.

Theorem 2.16 (Duality theorem). Let Φ be the left RL fractional state-transition matrix associated to A. Then, for every t ∈ (a,b], the matrix function Φ(t, ·) is the unique (global) solution to the linear square matrix right RL fractional Cauchy problem given by

\[
\begin{cases}
D^{\alpha}_{t-}[\Phi](s) = \Phi(s)A(s), \\
I^{1-\alpha}_{t-}[\Phi](t) = \mathrm{Id}_n,
\end{cases}
\]

considered on the compact interval [a,t].

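Corollary 2.17. Let $a \leq s_1 < s_2 \leq b$. Consider $w \in AC^{\alpha}_{s_1+}([s_1,b],\mathbb{R}^n)$ and $p \in AC^{\alpha}_{s_2-}([a,s_2],\mathbb{R}^n)$ being the unique (global) solutions to the linear - left and right - RL fractional Cauchy problems given by

\[
\begin{cases}
D^{\alpha}_{s_1+}[w](t) = A(t)w(t), \\
I^{1-\alpha}_{s_1+}[w](s_1) = w_1,
\end{cases}
\qquad \text{and} \qquad
\begin{cases}
D^{\alpha}_{s_2-}[p](t) = A(t)^{\top} p(t), \\
I^{1-\alpha}_{s_2-}[p](s_2) = p_2,
\end{cases}
\]

respectively considered on the compact intervals $[s_1,b]$ and $[a,s_2]$, where $w_1, p_2 \in \mathbb{R}^n$ are fixed Cauchy conditions. Then it holds that $\langle w_1, p(s_1) \rangle_{\mathbb{R}^n} = \langle w(s_2), p_2 \rangle_{\mathbb{R}^n}$.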

Remark 2.18. Using the basic Leibniz formula, Corollary 2.17 is trivial in the classical case α = 1. However, in the fractional setting 0 < α < 1, the Leibniz formula has no (simple) analogue. As a consequence, the extension of this result to the fractional case is not trivial, and requires the use of the fractional Duhamel formula and of the duality theorem stated above. Note that Corollary 2.17 is central for the introduction of an appropriate adjoint vector associated with variation vectors which are solutions to linear RL fractional Cauchy problems.
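For the reader's convenience, the classical computation alluded to in Remark 2.18 can be sketched as follows. It only uses the standard identifications valid for α = 1 (the left derivative is d/dt, the right derivative is −d/dt, and the fractional integral of order 0 is the identity).

```latex
% Classical case \alpha = 1: D^1_{s_1+} = d/dt, \; D^1_{s_2-} = -d/dt, \; I^0 = identity,
% so that \dot{w} = A w, \; -\dot{p} = A^\top p, \; w(s_1) = w_1, \; p(s_2) = p_2.
\[
  \frac{d}{dt} \langle w(t), p(t) \rangle_{\mathbb{R}^n}
  = \langle A(t) w(t), p(t) \rangle_{\mathbb{R}^n}
  - \langle w(t), A(t)^{\top} p(t) \rangle_{\mathbb{R}^n}
  = 0
  \quad \text{for a.e. } t \in [s_1, s_2],
\]
\[
  \langle w_1, p(s_1) \rangle_{\mathbb{R}^n}
  = \langle w(s_1), p(s_1) \rangle_{\mathbb{R}^n}
  = \langle w(s_2), p(s_2) \rangle_{\mathbb{R}^n}
  = \langle w(s_2), p_2 \rangle_{\mathbb{R}^n} .
\]
```

The first line is exactly where the Leibniz formula enters, which is why this argument does not carry over to 0 < α < 1.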

2.2.4 Some extensions in [B08]

Let me mention that, in my work [B08], the above results are established for more general RL and Caputo fractional Cauchy problems. Indeed:



• In the present section, we have only dealt with RL and Caputo fractional Cauchy problems on a compact interval [a,b]. In the paper [B08], the problems are considered on general intervals (which can be unbounded, or semiopen for example). Such a possibility is important for forthcoming works in view of studying fractional control systems in infinite horizon (for stabilization or minimal time problems for example).

• In the present section, we have only dealt with a single fractional order 0 < α ≤ 1. In the paper [B08], the problems are considered with a fractional multiorder $\alpha = (\alpha_1, \ldots, \alpha_n) \in (0,1]^n$. This setting derives from fractional optimal control problems in which a classical Lagrange cost is involved and rewritten as a Mayer cost. However note that this generalization involves technical difficulties on estimations and thus requires precautions (see [B08, Lemma 5 and its proof] for example).

• In the present section, no state restriction was imposed on the solutions. In the paper [B08], the solutions to (cCP) are restricted to take values in an open subset $\Omega \subset \mathbb{R}^n$. In that context, the last sentence of Theorem 2.6 is not valid. Moreover, in the alternative of Theorem 2.6, the unboundedness of a nonglobal maximal solution x has to be replaced by the next assertion: x does not take values in a compact subset K ⊂ Ω. As explained in Remark 2.7, to the best of my knowledge, this result has never been addressed in the literature before the contribution [B08].

Furthermore, the fractional Duhamel formula stated in Theorem 2.13 only deals with homogeneous linear RL fractional Cauchy problems. Let me mention that fractional Duhamel formulas have been established in the work [B08] for nonhomogeneous linear RL (and also Caputo) fractional Cauchy problems. These results constitute the most original part of the paper [B08] since, to the best of my knowledge, fractional state-transition matrices were already discussed in the literature [32, 52] but (only) in the case of a constant matrix $A(\cdot) \equiv A \in \mathbb{R}^{n \times n}$ (which corresponds to the extension of the notion of exponential matrix to the fractional setting, involving in particular the Mittag-Leffler function). I refer to [B08, Section 1.4] for more details.
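As a small illustration of the constant-coefficient case mentioned above, the following Python sketch evaluates the left RL fractional state-transition "matrix" in the scalar case n = 1 with A ≡ λ, using the standard closed form Φ(t,s) = (t−s)^(α−1) E_{α,α}(λ(t−s)^α). This is a sketch under stated assumptions and not code from [B08]: the function names are hypothetical, the series is simply truncated, and it is only meant for moderate values of |λ|(t−s)^α with 0 < α ≤ 1.

```python
import math

def mittag_leffler(z, alpha, beta, n_terms=100):
    """Truncated series E_{alpha,beta}(z) = sum_k z^k / Gamma(alpha*k + beta).

    Naive evaluation, assumed adequate only for moderate |z| and 0 < alpha, beta <= 1.
    """
    return sum(z ** k / math.gamma(alpha * k + beta) for k in range(n_terms))

def rl_transition_scalar(t, s, lam, alpha):
    """Scalar left RL state-transition function for the constant coefficient lam.

    Phi(t, s) = (t - s)^(alpha - 1) * E_{alpha,alpha}(lam * (t - s)^alpha), for t > s.
    """
    tau = t - s
    return tau ** (alpha - 1.0) * mittag_leffler(lam * tau ** alpha, alpha, alpha)
```

For α = 1 the function reduces to the usual exponential exp(λ(t−s)). For 0 < α < 1 it blows up like (t−s)^(α−1)/Γ(α) as t → s, which is the singularity handled by the weighted continuity result of Theorem 2.9.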

2.3 Main results

This section, extracted from the work [B10] written in collaboration with Bergounioux, is devoted to the main result of the present chapter (see Theorem 2.30). In Section 2.3.1 we introduce a general Caputo fractional optimal control problem and we fix the terminology and the assumptions. Section 2.3.2 is dedicated to essential results which concern the sensitivity analysis of the Caputo fractional state equation under perturbations of the control and of the initial condition. In Section 2.3.3 the Pontryagin maximum principle is stated (see Theorem 2.30), preceded by a Filippov-type theorem (see Theorem 2.28) which guarantees the existence of an optimal solution under some appropriate compactness/convexity hypotheses. Finally some perspectives for future research are listed in Section 2.3.4.

2.3.1 Caputo fractional optimal control problem: terminology and assumptions

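Let a < b be two real numbers. Let $m, n, \ell \in \mathbb{N}^*$ and let 0 < α ≤ 1 and β ≥ α be fixed. We consider the general Caputo fractional optimal control problem of Bolza form given by

\[
\begin{array}{rl}
\text{minimize} & g(x(a),x(b)) + I^{\beta}_{a+}[L(x,u,\cdot)](b), \\[4pt]
\text{subject to} & x \in {}_{c}AC^{\alpha}_{a+}([a,b],\mathbb{R}^n), \quad u \in L^{\infty}([a,b],\mathbb{R}^m), \\[4pt]
& {}_{c}D^{\alpha}_{a+}[x](t) = f(x(t),u(t),t), \quad \text{a.e. } t \in [a,b], \\[4pt]
& \psi(x(a),x(b)) \in \mathrm{S}, \\[4pt]
& u(t) \in \mathrm{U}, \quad \text{a.e. } t \in [a,b].
\end{array}
\tag{FP}
\]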



In Problem (FP), the variable x is called the state function (also called trajectory) and the variable u is called the control function. A couple (x,u) is said to be admissible for Problem (FP) if it satisfies all its constraints. A couple (x∗,u∗) is said to be a solution to Problem (FP) if it is admissible and minimizes the Bolza cost $g(x(a),x(b)) + I^{\beta}_{a+}[L(x,u,\cdot)](b)$ among all admissible couples (x,u).

Let us fix some terminology and assumptions:

• the dynamics $f : \mathbb{R}^n \times \mathbb{R}^m \times [a,b] \to \mathbb{R}^n$, that drives the Caputo fractional state equation given by ${}_{c}D^{\alpha}_{a+}[x](t) = f(x(t),u(t),t)$, satisfies the following conditions:

– f is continuous, and is Lipschitz continuous with respect to its first two variables on every compact subset.

– f is differentiable with respect to its first variable and $\nabla_x f$ is continuous.

• the real function $g : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$, that describes the Mayer cost g(x(a), x(b)), is of class $C^1$.

• the real function $L : \mathbb{R}^n \times \mathbb{R}^m \times [a,b] \to \mathbb{R}$, that describes the Lagrange cost $I^{\beta}_{a+}[L(x,u,\cdot)](b)$, satisfies the same conditions as the dynamics f.

• the set $\mathrm{S} \subset \mathbb{R}^{\ell}$ is a nonempty closed convex subset of $\mathbb{R}^{\ell}$ and the function $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^{\ell}$, that describes the terminal state constraint ψ(x(a), x(b)) ∈ S, is of class $C^1$.

• the set $\mathrm{U} \subset \mathbb{R}^m$, that describes the control constraint u(t) ∈ U, is a nonempty closed subset of $\mathbb{R}^m$.

Remark 2.19. In the literature, the Lagrange cost $I^{\beta}_{a+}[L(x,u,\cdot)](b)$ is usually considered with β = 1 (classical Lagrange cost) or β = α (fractional Lagrange cost). Note that one can always go back to the case β = α since the Lagrange cost $I^{\beta}_{a+}[L(x,u,\cdot)](b)$ can be rewritten as $I^{\alpha}_{a+}[L_{\beta}(x,u,\cdot)](b)$ where

\[
L_{\beta}(x,u,t) := \frac{\Gamma(\alpha)}{\Gamma(\beta)} (b-t)^{\beta-\alpha} L(x,u,t),
\]

for all $(x,u,t) \in \mathbb{R}^n \times \mathbb{R}^m \times [a,b]$. Since β ≥ α, note that $L_{\beta}$ satisfies the same regularity assumptions as L.
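The rewriting in Remark 2.19 follows in one line from the definition of the left RL fractional integral evaluated at t = b. The computation below is a sketch of that verification.

```latex
% Definition used: I^{\gamma}_{a+}[h](b) = \frac{1}{\Gamma(\gamma)} \int_a^b (b-t)^{\gamma-1} h(t) \, dt, \quad \gamma > 0.
\begin{align*}
  I^{\beta}_{a+}[L(x,u,\cdot)](b)
  &= \frac{1}{\Gamma(\beta)} \int_a^b (b-t)^{\beta-1} L(x(t),u(t),t) \, dt \\
  &= \frac{1}{\Gamma(\alpha)} \int_a^b (b-t)^{\alpha-1}
     \underbrace{\frac{\Gamma(\alpha)}{\Gamma(\beta)} (b-t)^{\beta-\alpha} L(x(t),u(t),t)}_{= \, L_{\beta}(x(t),u(t),t)} \, dt \\
  &= I^{\alpha}_{a+}[L_{\beta}(x,u,\cdot)](b) .
\end{align*}
```

The assumption β ≥ α ensures that the factor (b−t)^(β−α) is continuous on [a,b], so that no singularity is transferred to the integrand.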

2.3.2 Preliminaries: sensitivity analysis of Caputo fractional Cauchy problems

In this section we perform the sensitivity analysis of the Caputo fractional state equation in Problem (FP), in order to get differentiability results on the trajectory x with respect to perturbations of the control u and of the initial condition x(a). For this purpose, for any pair $(u,x_a) \in L^{\infty}([a,b],\mathbb{R}^m) \times \mathbb{R}^n$, we introduce the nonlinear Caputo fractional Cauchy problem $(\mathrm{cCP}_{u,x_a})$ given by

\[
\begin{cases}
{}_{c}D^{\alpha}_{a+}[x](t) = f(x(t),u(t),t), \quad \text{a.e. } t \in [a,b], \\
x(a) = x_a.
\end{cases}
\tag{$\mathrm{cCP}_{u,x_a}$}
\]

Applying Theorem 2.6 we get the next proposition.

Proposition 2.20. Let $(u,x_a) \in L^{\infty}([a,b],\mathbb{R}^m) \times \mathbb{R}^n$. The Caputo fractional Cauchy problem $(\mathrm{cCP}_{u,x_a})$ admits a unique maximal solution denoted by $(x(\cdot,u,x_a), I(u,x_a))$. We have the alternative:

(i) either the maximal solution $(x(\cdot,u,x_a), I(u,x_a))$ is global (in that case the pair $(u,x_a)$ is said to be admissible for globality);

(ii) or the maximal solution $(x(\cdot,u,x_a), I(u,x_a))$ is not global (in that case $x(\cdot,u,x_a)$ is unbounded on $I(u,x_a)$).



In what follows we denote by $\mathrm{AG}$ the set of all pairs $(u,x_a) \in L^{\infty}([a,b],\mathbb{R}^m) \times \mathbb{R}^n$ that are admissible for globality. The unboundedness of nonglobal maximal solutions allows us to prove that the set $\mathrm{AG}$ is $L^1 \times \mathbb{R}^n$-open, up to a uniform $L^{\infty}$-bound (see Proposition 2.21 which is extracted from [B10, Proposition A.1]). To this aim we use a proof by contradiction and the contradiction is raised by applying a fractional Gronwall lemma which can be found in [33, 94] or [B10, Proposition B.1].

Proposition 2.21. Let $(u,x_a) \in \mathrm{AG}$. For every $R \geq \|u\|_{L^{\infty}}$, there exists $\eta_R > 0$ such that the $L^1 \times \mathbb{R}^n$-neighborhood, up to a uniform $L^{\infty}$-bound, of $(u,x_a)$ given by

\[
\big[ B_{L^1}(u,\eta_R) \cap B_{L^{\infty}}(0_{L^{\infty}},R) \big] \times B_{\mathbb{R}^n}(x_a,\eta_R),
\]

is contained in $\mathrm{AG}$.

We are now in a position to perform the sensitivity analysis of the Caputo fractional Cauchy problem $(\mathrm{cCP}_{u,x_a})$ under perturbations of the control u and of the initial condition $x_a$ in the next two paragraphs.

Needle-like variation of the control. In this paragraph we consider a pair $(u,x_a) \in \mathrm{AG}$. We look for differentiability of the state $x(\cdot,u,x_a)$ with respect to specific perturbations (called needle-like variations in the literature) of the control u. For this purpose, we denote in the sequel by $\mathrm{Leb}[f(x(\cdot,u,x_a),u,\cdot)]$ the set of all Lebesgue points in [a,b) of the essentially bounded map $t \mapsto f(x(t,u,x_a),u(t),t)$. Recall that $\mathrm{Leb}[f(x(\cdot,u,x_a),u,\cdot)]$ has a full Lebesgue measure equal to b − a.

Proposition 2.22. Let $(u,x_a) \in \mathrm{AG}$. For all $(s,v) \in \mathrm{Leb}[f(x(\cdot,u,x_a),u,\cdot)] \times \mathbb{R}^m$, there exists $\bar{\delta} > 0$ such that $(u_{\delta},x_a) \in \mathrm{AG}$ for all $0 \leq \delta \leq \bar{\delta}$, where $u_{\delta} \in L^{\infty}([a,b],\mathbb{R}^m)$ is the needle-like variation of u associated to (s,v) defined by

\[
u_{\delta}(\tau) :=
\begin{cases}
v & \text{if } \tau \in [s,s+\delta), \\
u(\tau) & \text{if } \tau \notin [s,s+\delta),
\end{cases}
\]

for almost every τ ∈ [a,b]. Furthermore the quotient

\[
\frac{x(\cdot,u_{\delta},x_a) - x(\cdot,u,x_a)}{\delta},
\]

uniformly converges on [s + ς, b], for any 0 < ς ≤ b − s, to $w_{(s,v)}(\cdot,u,x_a)$ when δ → 0, where the map $w_{(s,v)}(\cdot,u,x_a)$ is the unique (global) solution to the linear left RL fractional Cauchy problem given by

\[
\begin{cases}
D^{\alpha}_{s+}[w](t) = \nabla_x f(x(t,u,x_a),u(t),t) \, w(t), \quad \text{a.e. } t \in [s,b], \\
I^{1-\alpha}_{s+}[w](s) = f(x(s,u,x_a),v,s) - f(x(s,u,x_a),u(s),s).
\end{cases}
\]

The function $w_{(s,v)}(\cdot,u,x_a)$ is called the variation vector associated to $(u,x_a)$ and (s,v).
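To fix ideas on the construction of the perturbed control, here is a minimal Python sketch of a needle-like variation for a scalar-valued or vector-valued control represented as a plain function of time. The helper name `needle_variation` is hypothetical and the representation of controls as Python callables is an assumption made only for illustration.

```python
def needle_variation(u, s, delta, v):
    """Return the needle-like variation u_delta of the control u.

    u_delta equals the constant value v on [s, s + delta) and coincides
    with u everywhere else (hypothetical helper, for illustration only).
    """
    def u_delta(tau):
        return v if s <= tau < s + delta else u(tau)
    return u_delta
```

Note that u_δ differs from u only on an interval of length δ, so the L¹-distance between u_δ and u is at most δ(‖v‖ + ‖u‖_{L∞}) while both controls share a common L∞-bound. The perturbation is thus small in L¹ even though it is not small in L∞, which is the type of neighborhood in which the set of pairs admissible for globality is open.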

Remark 2.23. Proposition 2.22 can be found in [B10, Proposition 3.3]. Its proof is based on technical estimations on the variation vector $w_{(s,v)}(\cdot,u,x_a)$ (using in particular its weighted continuity), on Taylor expansion formulas and on the application of a fractional Gronwall lemma which can be found in [33, 94] or [B10, Proposition B.1].

Remark 2.24. Note that needle-like variations of the control in a Caputo fractional control system lead to variation vectors which are solutions to linear RL (and not Caputo) fractional Cauchy problems. Moreover note that the RL initial condition involves the left RL integral $I^{1-\alpha}_{s+}$ with lower bound s (and not a).



Perturbation of the initial condition. Still assuming that $(u,x_a) \in \mathrm{AG}$, we now look for differentiability of the state $x(\cdot,u,x_a)$ with respect to perturbations of the initial condition $x_a$.

Proposition 2.25. Let $(u,x_a) \in \mathrm{AG}$. For all $y \in \mathbb{R}^n$, there exists $\bar{\delta} > 0$ such that $(u,x_a+\delta y) \in \mathrm{AG}$ for all $0 \leq \delta \leq \bar{\delta}$. Furthermore the quotient

\[
\frac{x(\cdot,u,x_a+\delta y) - x(\cdot,u,x_a)}{\delta},
\]

uniformly converges on [a,b] to $w_y(\cdot,u,x_a)$ when δ → 0, where the map $w_y(\cdot,u,x_a)$ is the unique maximal solution, which is moreover global, to the linear left Caputo fractional Cauchy problem given by

\[
\begin{cases}
{}_{c}D^{\alpha}_{a+}[w](t) = \nabla_x f(x(t,u,x_a),u(t),t) \, w(t), \quad \text{a.e. } t \in [a,b], \\
w(a) = y.
\end{cases}
\]

The function $w_y(\cdot,u,x_a)$ is called the variation vector associated to $(u,x_a)$ and y.

Remark 2.26. Proposition 2.25 can be found in [B10, Proposition 3.4]. Its proof is based on Taylor expansion formulas and on the application of a fractional Gronwall lemma which can be found in [33, 94] or [B10, Proposition B.1].

Remark 2.27. In contrast to needle-like variations of the control (see Remark 2.24), note that perturbations of the initial condition in a Caputo fractional control system lead to variation vectors which are solutions to linear Caputo (and not RL) fractional Cauchy problems. Nevertheless, in order to define a common adjoint vector to all these variation vectors, we can go back to (nonhomogeneous) linear RL fractional Cauchy problems by introducing $W_y(\cdot,u,x_a) := w_y(\cdot,u,x_a) - y$. I refer to [B10, Section 3.3.3] for more details.

2.3.3 Filippov existence theorem and Pontryagin maximum principle

The main concern of this chapter is to state a strong version of the Pontryagin maximum principle for Problem (FP). Nevertheless, with Bergounioux in [B10], we were eager to prove first that the existence of optimal controls, under some standard assumptions, is preserved at the fractional level. Thus, in this section, we first provide a result stating the existence of at least one solution to Problem (FP) under some appropriate compactness/convexity assumptions. Precisely we follow the standard Filippov approach (see, e.g., [22, 25, 38, 64]). For this purpose we introduce the usual set of augmented velocities defined by

\[
(f,L^{+})(x,\mathrm{U},t) := \big\{ (f(x,u,t), L(x,u,t)+\gamma) \mid u \in \mathrm{U}, \ \gamma \geq 0 \big\} \subset \mathbb{R}^{n+1},
\]

for all $(x,t) \in \mathbb{R}^n \times [a,b]$. Moreover let $E \subset C([a,b],\mathbb{R}^n)$ stand for the set of all trajectories $x \in {}_{c}AC^{\alpha}_{a+}([a,b],\mathbb{R}^n)$ that can be associated to a control $u \in L^{\infty}([a,b],\mathbb{R}^m)$ such that the couple (x,u) is admissible for Problem (FP). Obviously, if E is empty, then Problem (FP) has no solution. Otherwise, the following existence result holds true.

Theorem 2.28 (Filippov existence theorem). Assume that E is nonempty and is bounded in the space $C([a,b],\mathbb{R}^n)$, U is compact and $(f,L^{+})(x,\mathrm{U},t)$ is convex for all $(x,t) \in \mathbb{R}^n \times [a,b]$. Then Problem (FP) has at least one solution.

Remark 2.29. Theorem 2.28 can be found in [B10, Theorem 3.1]. Its proof is a simple adaptation of the classical proof to the fractional setting and no major difficulty has been encountered. Note that the regularity assumptions on f, L, g and ψ introduced in Section 2.3.1 can be weakened for Theorem 2.28. Indeed only the continuity of f, L, g and ψ and Lipschitz continuity in the first two variables on compact subsets (for f and L) are required. Similarly the convexity of S is a superfluous assumption that can be removed.

Before stating our main result (Theorem 2.30 below), we first need to recall two basic notions:



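• the normal cone to S at a point x ∈ S is defined by

\[
N_{\mathrm{S}}[x] := \big\{ z \in \mathbb{R}^{\ell} \mid \forall x' \in \mathrm{S}, \ \langle z, x'-x \rangle_{\mathbb{R}^{\ell}} \leq 0 \big\}.
\]

• the map $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^{\ell}$ is said to be submersive at a point $(x_a,x_b) \in \mathbb{R}^n \times \mathbb{R}^n$ if its differential at this point is surjective.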
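As a quick sanity check on the definition of the normal cone, three elementary instances can be computed by hand. The sets S below are illustrative choices made here and are not taken from [B10].

```latex
% Three elementary instances of the normal cone N_S[x] (illustrative choices of S).
\begin{align*}
  \mathrm{S} = \{ s_0 \} \ \text{(singleton)}
    &\quad \Longrightarrow \quad N_{\mathrm{S}}[s_0] = \mathbb{R}^{\ell}, \\
  \mathrm{S} = \mathbb{R}^{\ell} \ \text{(no constraint)}
    &\quad \Longrightarrow \quad N_{\mathrm{S}}[x] = \{ 0_{\mathbb{R}^{\ell}} \} \ \text{for all } x \in \mathbb{R}^{\ell}, \\
  \mathrm{S} = (-\infty,0]^{\ell} \ \text{(inequality constraints)}
    &\quad \Longrightarrow \quad N_{\mathrm{S}}[0_{\mathbb{R}^{\ell}}] = [0,+\infty)^{\ell} .
\end{align*}
```

In the first case any vector is admissible as a multiplier, while in the second case the multiplier associated with the constraint must vanish.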

We are now in a position to formulate the main result of the present chapter.

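Theorem 2.30 (Pontryagin maximum principle). If $(x^*,u^*) \in {}_{c}AC^{\alpha}_{a+}([a,b],\mathbb{R}^n) \times L^{\infty}([a,b],\mathbb{R}^m)$ is a solution to Problem (FP), then there exists a nontrivial couple (p,λ), where λ ≥ 0 and $p \in AC^{\alpha}_{b-}([a,b],\mathbb{R}^n)$ (called adjoint or costate vector) such that:

(i) Fractional Hamiltonian system (or extremal equations): it holds that

\[
\begin{cases}
{}_{c}D^{\alpha}_{a+}[x^*](t) = \nabla_p H(x^*(t),u^*(t),p(t),\lambda,t), \\
D^{\alpha}_{b-}[p](t) = \nabla_x H(x^*(t),u^*(t),p(t),\lambda,t),
\end{cases}
\]

for almost every t ∈ [a,b], where the Hamiltonian $H : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R} \times [a,b) \to \mathbb{R}$ associated to Problem (FP) is defined by

\[
H(x,u,p,\lambda,t) := \langle p, f(x,u,t) \rangle_{\mathbb{R}^n} - \lambda \frac{(b-t)^{\beta-1}}{\Gamma(\beta)} L(x,u,t),
\]

for all $(x,u,p,\lambda,t) \in \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R} \times [a,b)$.

(ii) Hamiltonian maximization condition: it holds that

\[
u^*(t) \in \operatorname*{arg\,max}_{v \in \mathrm{U}} H(x^*(t),v,p(t),\lambda,t),
\]

for almost every t ∈ [a,b].

(iii) Transversality conditions on the adjoint vector: if ψ is submersive at (x∗(a), x∗(b)), then the nontrivial couple (p,λ) can be selected to satisfy

\[
\begin{pmatrix}
I^{1-\alpha}_{b-}[p](a) \\
-I^{1-\alpha}_{b-}[p](b)
\end{pmatrix}
= \lambda \nabla g(x^*(a),x^*(b)) + \nabla \psi(x^*(a),x^*(b))^{\top} \times \xi,
\]

where $\xi \in N_{\mathrm{S}}[\psi(x^*(a),x^*(b))]$.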

Remark 2.31. Theorem 2.30 and its proof can be found in [B10, Theorem 3.2]. Since Problem (FP)is an optimization problem and we were looking for necessary optimality conditions, the proofof Theorem 2.30 is strongly based on the sensitivity analysis performed in Section 2.3.2. On theother hand, in order to take into account of the terminal state constraint ψ(x(a), x(b)) ∈ S underperturbations of the control and of the initial condition, we introduced a penalized functional andwe derived the necessary optimality conditions by invoking the Ekeland variational principle [36].In particular, in order to define the penalized functional on a complete metric set, the closednessof U is required.

Remark 2.32. The nontrivial couple $(p,\lambda)$ in Theorem 2.30, which is a Lagrange multiplier, is defined up to a positive multiplicative scalar. Defining as usual an extremal as a quadruple $(x,u,p,\lambda)$ solution to the extremal equations, an extremal is said to be normal whenever $\lambda \neq 0$ and abnormal whenever $\lambda = 0$. In the normal case $\lambda \neq 0$, it is usual to normalize the Lagrange multiplier so that $\lambda = 1$.

Remark 2.33. Theorem 2.30 encompasses the classical Pontryagin maximum principle when considering $\alpha = \beta = 1$.



Remark 2.34. Note that Theorem 2.30 has been successfully applied in [B10, Section 4] in order to solve analytically two examples (including endpoint and control value constraints). More precisely, a fractional version of the classical parking problem (or double integrator problem) with fixed initial and final conditions has been solved, as well as a fractional Zermelo problem involving control constraints.

Remark 2.35. In Problem (FP), the state fractional equation ${}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}[x](t) = f(x(t), u(t), t)$ involves the left Caputo fractional operator ${}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}$, while the adjoint equation in Theorem 2.30 given by $\mathrm{D}^{\alpha}_{b-}[p](t) = \nabla_x H(x^*(t), u^*(t), p(t), \lambda, t)$ involves the right RL fractional derivative $\mathrm{D}^{\alpha}_{b-}$. Accordingly, Problem (FP) depends on the terminal conditions $x(a)$ and $x(b)$, while the transversality conditions on the adjoint vector in Theorem 2.30 involve $\mathrm{I}^{1-\alpha}_{b-}[p](a)$ and $\mathrm{I}^{1-\alpha}_{b-}[p](b)$. Finally, note that the adjoint vector $p \in \mathrm{AC}^{\alpha}_{b-}([a,b],\mathbb{R}^n)$ may admit a singularity at $t = b$ (see the right counterpart of Remark 1.8 and see [B10, Section 4] for two solved examples).

Remark 2.36. Let us describe some typical situations of terminal state constraint $\psi(x(a), x(b)) \in \mathrm{S}$ in Problem (FP), and the corresponding transversality conditions in Theorem 2.30:

• If the terminal points are fixed in Problem (FP), one may consider $\ell = 2n$, $\psi$ as the identity function and $\mathrm{S} = \{x_a\} \times \{x_b\}$ where $x_a, x_b \in \mathbb{R}^n$ are the fixed terminal points. In that case, the transversality conditions in Theorem 2.30 do not provide any additional information.

• If the initial point is fixed and the final point is free in Problem (FP), one may consider $\ell = 2n$, $\psi$ as the identity function and $\mathrm{S} = \{x_a\} \times \mathbb{R}^n$ where $x_a \in \mathbb{R}^n$ is the fixed initial point. In that case, the nontriviality of the couple $(p,\lambda)$, the linearity of the adjoint equation $\mathrm{D}^{\alpha}_{b-}[p](t) = \nabla_x H(x^*(t), u^*(t), p(t), \lambda, t)$ and the transversality conditions on the adjoint vector imply that $\lambda \neq 0$ (which we normalize to $\lambda = 1$, see Remark 2.32) and $\mathrm{I}^{1-\alpha}_{b-}[p](b) = -\nabla_2 g(x^*(a), x^*(b))$.

• If the initial point is fixed and the final point is subject to inequality constraints $\Psi_i(x(b)) \leq 0$ for $i = 1, \ldots, j$, one may consider $\ell = n + j$, $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^{n+j}$, $\psi(x_a, x_b) := (x_a, \Psi(x_b))$ where $\Psi = (\Psi_1, \ldots, \Psi_j) : \mathbb{R}^n \to \mathbb{R}^j$ and $\mathrm{S} = \{x_a\} \times (\mathbb{R}_-)^j$. If $\Psi$ is of class $\mathrm{C}^1$ and is submersive at any point $x_b \in \Psi^{-1}((\mathbb{R}_-)^j)$, then the transversality conditions in Theorem 2.30 can be written as

\[ -\mathrm{I}^{1-\alpha}_{b-}[p](b) = \lambda \nabla_2 g(x^*(a), x^*(b)) + \sum_{i=1}^{j} \lambda_i \nabla \Psi_i(x^*(b)), \]

for some $\lambda_i \geq 0$, $i = 1, \ldots, j$.

• If there is no Mayer cost (that is, $g \equiv 0$) and the periodic condition $x(a) = x(b)$ is imposed in Problem (FP), one may consider $\ell = n$, $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$, $\psi(x_a, x_b) := x_b - x_a$ and $\mathrm{S} = \{0_{\mathbb{R}^n}\}$. In that case, the transversality conditions in Theorem 2.30 yield that $\mathrm{I}^{1-\alpha}_{b-}[p](a) = \mathrm{I}^{1-\alpha}_{b-}[p](b)$.

We point out that, in all examples above, the function ψ is indeed a submersion.
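The multipliers $\lambda_i \geq 0$ of the third situation above come from the normal cone to $(\mathbb{R}_-)^j$. The following Python sketch is an illustration only (it is not taken from [B10], and the function names are hypothetical). It tests membership in this normal cone in two ways: through the standard characterization ($\xi_i \geq 0$ for every $i$, with $\xi_i = 0$ whenever the $i$-th constraint is inactive), and through the defining inequality of the normal cone evaluated on a finite grid of points of $(\mathbb{R}_-)^j$. The grid test is only a necessary check, since it samples finitely many points.

```python
from itertools import product

def in_normal_cone_orthant(xi, y, tol=1e-12):
    """Characterization of xi in N_S[y] for S = (R_-)^j and y in S:
    xi_i >= 0 for every i, and xi_i = 0 whenever y_i < 0 (inactive constraint)."""
    return all(c >= -tol and (abs(yi) <= tol or abs(c) <= tol)
               for c, yi in zip(xi, y))

def satisfies_definition(xi, y, grid=(-2.0, -1.5, -1.0, -0.5, 0.0)):
    """Check <xi, y' - y> <= 0 for every y' on a finite grid of points of (R_-)^j."""
    for y_prime in product(grid, repeat=len(y)):
        pairing = sum(c * (yp - yi) for c, yp, yi in zip(xi, y_prime, y))
        if pairing > 1e-12:
            return False
    return True

# Assumed sample data: the first constraint is active, the second one is inactive.
y = [0.0, -1.0]
for xi in ([2.0, 0.0], [2.0, 0.5], [-1.0, 0.0]):
    print(xi, in_normal_cone_orthant(xi, y), satisfies_definition(xi, y))
```

On these samples the two tests agree: only the first vector, whose second component vanishes on the inactive constraint, is accepted.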

Remark 2.37. In this remark we assume that the Hamiltonian $H$ introduced in Theorem 2.30 is differentiable with respect to its second variable (for example, if $f$ and $L$ are so). In that situation, if $\mathrm{U}$ is convex, then the Hamiltonian maximization condition in Theorem 2.30 implies the (weaker) nonpositive Hamiltonian gradient condition given by

\[ \langle \nabla_u H(x^*(t), u^*(t), p(t), \lambda, t), v - u^*(t) \rangle_{\mathbb{R}^m} \leq 0, \]

for all $v \in \mathrm{U}$ and for almost every $t \in [a,b]$. Note that this condition can be rewritten as

\[ \nabla_u H(x^*(t), u^*(t), p(t), \lambda, t) \in \mathrm{N}_{\mathrm{U}}[u^*(t)], \]

for almost every $t \in [a,b]$. In particular, if $\mathrm{U} = \mathbb{R}^m$ (that is, no control constraint in Problem (FP)), then the Hamiltonian maximization condition in Theorem 2.30 implies the (weaker) null Hamiltonian gradient condition given by

\[ \nabla_u H(x^*(t), u^*(t), p(t), \lambda, t) = 0_{\mathbb{R}^m}, \]

for almost every $t \in [a,b]$.
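To make the implication of Remark 2.37 concrete, here is a minimal Python sketch on a toy scalar Hamiltonian. The choices $H(v) = p\,v - v^2/2$ (that is, $f(x,u,t) = u$, $L = u^2/2$, $\lambda = 1$, $\beta = 1$) and $\mathrm{U} = [-1,1]$ are assumptions made for illustration, and the function names are hypothetical. The maximizer over $\mathrm{U}$ is the projection of $p$ onto $[-1,1]$, and the sketch checks the gradient inequality on a grid of admissible values $v$.

```python
def argmax_hamiltonian(p, lo=-1.0, hi=1.0):
    """Maximizer over U = [lo, hi] of the toy Hamiltonian H(v) = p*v - v**2/2,
    that is, the projection of p onto [lo, hi]."""
    return min(max(p, lo), hi)

def gradient_condition_holds(p, lo=-1.0, hi=1.0, n=201):
    """Check <grad_u H(u*), v - u*> <= 0 for v on a uniform grid of U = [lo, hi]."""
    u_star = argmax_hamiltonian(p, lo, hi)
    grad = p - u_star                      # derivative of p*v - v**2/2 at v = u*
    for k in range(n):
        v = lo + (hi - lo) * k / (n - 1)   # grid point of U
        if grad * (v - u_star) > 1e-12:
            return False
    return True

for p in (-3.0, -0.4, 0.0, 0.7, 2.5):
    print(p, argmax_hamiltonian(p), gradient_condition_holds(p))
```

When $|p| < 1$ the maximizer is interior and the gradient vanishes; when $|p| \geq 1$ the maximizer sits on the boundary and the gradient points outward, which is the normal cone inclusion stated in the remark.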




Remark 2.38. If $\beta = 1$ in Problem (FP), then we recover in Theorem 2.30 the usual Hamiltonian function defined by $H(x,u,p,\lambda,t) := \langle p, f(x,u,t) \rangle_{\mathbb{R}^n} - \lambda L(x,u,t)$. On the other hand, if $\beta \neq 1$, then the Hamiltonian is not standard any longer since it is given by $H(x,u,p,\lambda,t) := \langle p, f(x,u,t) \rangle_{\mathbb{R}^n} - \lambda \frac{(b-t)^{\beta-1}}{\Gamma(\beta)} L(x,u,t)$. This phenomenon is due to the nonlocality of the fractional operator $\mathrm{I}^{\beta}_{a+}$, but it is natural since the fractional Lagrange cost can be rewritten as

\[ \mathrm{I}^{\beta}_{a+}[L(x,u,\cdot)](b) = \mathrm{I}^{1}_{a+}\left[ \frac{(b-\cdot)^{\beta-1}}{\Gamma(\beta)} L(x,u,\cdot) \right](b). \]

In particular, if $\beta \neq 1$, the Hamiltonian considered in Theorem 2.30 may not be autonomous, even if $f$ and $L$ are so.
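The rewriting of the fractional Lagrange cost as a weighted classical integral can be checked numerically. The Python sketch below is an illustration under stated assumptions, not part of [B10]: the integrand is replaced by the test function $t \mapsto t^2$ on $[0,1]$, for which the Riemann–Liouville integral has the closed form $\Gamma(3)/\Gamma(3+\beta)$, and the function names are hypothetical. The weighted integral is evaluated by a midpoint rule after the substitution $s = (b-t)^{\beta}$, which removes the weak singularity of the weight at $t = b$ when $\beta < 1$.

```python
import math

def rl_integral_at_b(f, a, b, beta, n=20000):
    """Approximate I^beta_{a+}[f](b) = (1/Gamma(beta)) * int_a^b (b-t)^(beta-1) f(t) dt.

    With s = (b - t)^beta the integral becomes
    (1/(beta*Gamma(beta))) * int_0^{(b-a)^beta} f(b - s^(1/beta)) ds,
    which is approximated here by the composite midpoint rule."""
    upper = (b - a) ** beta
    h = upper / n
    total = sum(f(b - ((k + 0.5) * h) ** (1.0 / beta)) for k in range(n))
    return total * h / (beta * math.gamma(beta))

def power_closed_form(k, beta, b):
    """Closed form of I^beta_{0+}[t^k](b) = Gamma(k+1)/Gamma(k+1+beta) * b^(k+beta)."""
    return math.gamma(k + 1) / math.gamma(k + 1 + beta) * b ** (k + beta)

for beta in (0.5, 1.0, 1.7):
    approx = rl_integral_at_b(lambda t: t ** 2, 0.0, 1.0, beta)
    print(beta, approx, power_closed_form(2, beta, 1.0))
```

For $\beta = 1$ the weight is identically one and the classical integral $\int_0^1 t^2 \, dt = 1/3$ is recovered, in line with the first sentence of the remark.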

2.3.4 Perspectives

Consider the framework of Theorem 2.30 in the classical case $\alpha = \beta = 1$ and recall the notion of maximized Hamiltonian $\mathcal{H} : [a,b] \to \mathbb{R}$ defined by

\[ \mathcal{H}(t) := H(x^*(t), u^*(t), p(t), \lambda, t), \]

for almost every $t \in [a,b]$. If $H$ is differentiable with respect to $t$ with $\nabla_t H$ continuous (for example if $f$ and $L$ are so), it is well known that $\mathcal{H}$ is equal almost everywhere on $[a,b]$ to an absolutely continuous function (denoted similarly) which satisfies

\[ \dot{\mathcal{H}}(t) = \nabla_t H(x^*(t), u^*(t), p(t), \lambda, t), \]

for almost every $t \in [a,b]$. This property is known as the Hamiltonian (absolute) continuity and we roughly say that the total derivative of the Hamiltonian is equal to its partial derivative. In particular, if the problem is autonomous, then $\mathcal{H}$ is constant. We refer for instance to [37, Theorem 2.6.3] for details in the classical theory $\alpha = \beta = 1$. This property provides an additional necessary optimality condition and is particularly interesting in order to deal with optimal control problems with free final time (which encompass minimal time problems for example). Indeed, it is well known that a change of time variable allows one to convert a free final time problem into an autonomous fixed final time problem. Then, from the constancy of the corresponding maximized Hamiltonian, combined with a parameterized version of the classical Pontryagin maximum principle, the usual transversality condition on the optimal free final time is obtained. We refer for instance to [51, Chapter 14] for details in the classical theory $\alpha = \beta = 1$. To the best of my knowledge, no Hamiltonian continuity in the fractional case $0 < \alpha < 1$ and $\beta \geq \alpha$ has been announced, proved or refuted in the literature. This is due in particular to the lack of a (simple) formula for the fractional derivative of a composition. One of my priorities for future works in that field is to face this challenging open question. In particular the final objective in my mind is to establish a version of the Pontryagin maximum principle that handles Caputo fractional optimal control problems with free final time. Let me point out that some earlier works like [18, 70, 81, 90] already deal with this topic.

I conclude this paragraph by emphasizing that, similarly to the classical case $\alpha = \beta = 1$, the Pontryagin maximum principle stated in Theorem 2.30 allows one to solve analytically (only) a small number of basic Caputo fractional optimal control problems (see [B10, Section 4] for two examples). In the classical case $\alpha = \beta = 1$, recall that the Pontryagin maximum principle induces an indirect method in order to solve them numerically, by using a so-called shooting method. Precisely it reduces classical optimal control problems to boundary value problems that can be solved by numerical solvers, such as Newton methods for example. We refer for instance to [21, Section 3.3] for details on indirect numerical methods in the classical case $\alpha = \beta = 1$. One of my objectives for future works in that field is to extend these methods to the fractional setting. However, due to the asymmetry between the fractional operators ${}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}$ and $\mathrm{D}^{\alpha}_{b-}$ involved in Theorem 2.30 (see Remark 2.35), the numerical resolution of the fractional Hamiltonian system would not be trivial. Hence one of my goals is to overcome this first difficulty in order to solve Caputo fractional optimal control problems with a numerical indirect method based on Theorem 2.30. I refer to [3, 4, 12, 53, 81] for previous numerical studies on various fractional optimal control problems.
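For readers unfamiliar with indirect methods, the following Python sketch illustrates the shooting method in the classical case $\alpha = \beta = 1$ only; it is not a fractional solver and is not taken from the cited references. The test problem is an assumption chosen for illustration: the double integrator $\dot{x}_1 = x_2$, $\dot{x}_2 = u$ on $[0,1]$ with cost $\frac{1}{2}\int_0^1 u^2 \, dt$, $x(0) = (0,0)$ and $x(1) = (1,0)$. With $H = p_1 x_2 + p_2 u - u^2/2$, the maximization condition gives $u = p_2$ and the adjoint equation reads $\dot{p} = -\nabla_x H$. The unknown is the initial adjoint vector $p(0)$, which is found by a Newton iteration with a finite-difference Jacobian. All function names are hypothetical.

```python
def rk4_flow(p0, steps=100):
    """Integrate the classical Hamiltonian system on [0, 1] with the RK4 scheme,
    starting from x(0) = (0, 0) and p(0) = p0; returns (x1, x2, p1, p2) at t = 1."""
    def rhs(z):
        x1, x2, p1, p2 = z
        u = p2                        # maximizer of H = p1*x2 + p2*u - u**2/2
        return (x2, u, 0.0, -p1)      # x' = grad_p H, p' = -grad_x H
    z = (0.0, 0.0, p0[0], p0[1])
    h = 1.0 / steps
    for _ in range(steps):
        k1 = rhs(z)
        k2 = rhs(tuple(zi + 0.5 * h * ki for zi, ki in zip(z, k1)))
        k3 = rhs(tuple(zi + 0.5 * h * ki for zi, ki in zip(z, k2)))
        k4 = rhs(tuple(zi + h * ki for zi, ki in zip(z, k3)))
        z = tuple(zi + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                  for zi, a, b, c, d in zip(z, k1, k2, k3, k4))
    return z

def shooting_function(p0, target=(1.0, 0.0)):
    """Mismatch between the final state reached from p(0) = p0 and the target."""
    z = rk4_flow(p0)
    return (z[0] - target[0], z[1] - target[1])

def newton_shooting(p0=(0.0, 0.0), iterations=5, eps=1e-6):
    """Newton iteration on the shooting function with a finite-difference Jacobian."""
    p = list(p0)
    for _ in range(iterations):
        f0 = shooting_function(p)
        cols = []
        for j in range(2):
            q = list(p)
            q[j] += eps
            fj = shooting_function(q)
            cols.append(((fj[0] - f0[0]) / eps, (fj[1] - f0[1]) / eps))
        a, c = cols[0]                # first column of the Jacobian
        b, d = cols[1]                # second column of the Jacobian
        det = a * d - b * c
        p[0] += (-d * f0[0] + b * f0[1]) / det   # solve J * delta = -f0 (2x2 case)
        p[1] += (c * f0[0] - a * f0[1]) / det
    return tuple(p)

print(newton_shooting())
```

For this linear-quadratic test problem the extremal can be computed by hand, $u(t) = 6 - 12t$ with $p(0) = (12, 6)$, which gives a reference value for the output. In the fractional setting the forward Caputo equation and the backward Riemann–Liouville adjoint equation cannot be integrated jointly in this simple way, which is the difficulty mentioned above.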



2.4 Applications to fractional calculus of variations

This section is extracted from the collaboration [B11] jointly with Ferreira. This work was primarily motivated by our findings in the literature of some flaws within the proof of the second-order Legendre necessary optimality condition for fractional calculus of variations problems. We were eager to elaborate together a correct proof, but it turns out that the standard proof used in the classical theory cannot be extended to the fractional case (see Section 2.4.2 for a discussion on that technical point). On the other hand, the Pontryagin maximum principle stated in Theorem 2.30 turns out to be suitable in order to provide a complete and correct proof.

2.4.1 Legendre condition for a Caputo fractional calculus of variations problem

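In this section we preserve the notations and assumptions introduced in Section 2.3.1 and we focus on the minimization problem of the general fractional Bolza functional given by

\[ B : \mathrm{K} \subset {}_{\mathrm{c}}\mathrm{AC}^{\alpha,\infty}_{a+}([a,b],\mathbb{R}^n) \longrightarrow \mathbb{R}, \qquad x \longmapsto B(x) := g(x(a), x(b)) + \mathrm{I}^{\beta}_{a+}\left[ L(x, {}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}[x], \cdot) \right](b), \]

where $0 < \alpha \leq 1$ and $\beta \geq \alpha$, and where

\[ {}_{\mathrm{c}}\mathrm{AC}^{\alpha,\infty}_{a+}([a,b],\mathbb{R}^n) := \{ x \in {}_{\mathrm{c}}\mathrm{AC}^{\alpha}_{a+}([a,b],\mathbb{R}^n) \mid {}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}[x] \in \mathrm{L}^{\infty}([a,b],\mathbb{R}^n) \}, \]

\[ \mathrm{K} := \{ x \in {}_{\mathrm{c}}\mathrm{AC}^{\alpha,\infty}_{a+}([a,b],\mathbb{R}^n) \mid \psi(x(a), x(b)) \in \mathrm{S} \}. \]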

The minimization problem of the above fractional Bolza functional $B$ exactly coincides with Problem (FP) in Section 2.3, by considering $m = n$, $f(x,u,t) := u$ and $\mathrm{U} = \mathbb{R}^n$. Thus Theorem 2.30 can be applied. Nevertheless, in order to invoke Remark 2.37 and derive the second-order Legendre necessary optimality condition, we add the following assumption: the real function $L$ is of class $\mathrm{C}^1$ in its first two variables and $\nabla_2 L$ is differentiable with respect to its second variable. We are now in a position to state the main result of this section.

Theorem 2.39. If x∗ ∈ K is a minimizer of B and ψ is submersive at (x∗(a), x∗(b)), then:

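(i) Euler–Lagrange equation: the map $\frac{(b-\cdot)^{\beta-1}}{\Gamma(\beta)} \nabla_2 L(x^*, {}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}[x^*], \cdot)$ belongs to $\mathrm{AC}^{\alpha}_{b-}([a,b],\mathbb{R}^n)$ with

\[ \frac{(b-t)^{\beta-1}}{\Gamma(\beta)} \nabla_1 L(x^*(t), {}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}[x^*](t), t) + \mathrm{D}^{\alpha}_{b-}\left[ \frac{(b-\cdot)^{\beta-1}}{\Gamma(\beta)} \nabla_2 L(x^*, {}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}[x^*], \cdot) \right](t) = 0_{\mathbb{R}^n}, \]

for almost every $t \in [a,b]$;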

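(ii) Transversality condition: it holds that

\[ \begin{pmatrix} \mathrm{I}^{1-\alpha}_{b-}\left[ \frac{(b-\cdot)^{\beta-1}}{\Gamma(\beta)} \nabla_2 L(x^*, {}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}[x^*], \cdot) \right](a) \\ -\mathrm{I}^{1-\alpha}_{b-}\left[ \frac{(b-\cdot)^{\beta-1}}{\Gamma(\beta)} \nabla_2 L(x^*, {}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}[x^*], \cdot) \right](b) \end{pmatrix} = \nabla g(x^*(a), x^*(b)) + \nabla \psi(x^*(a), x^*(b))^{\top} \times \xi, \]

where $\xi \in \mathrm{N}_{\mathrm{S}}[\psi(x^*(a), x^*(b))]$;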

(iii) Legendre condition: the matrix $\frac{(b-t)^{\beta-1}}{\Gamma(\beta)} \nabla^2_{22} L(x^*(t), {}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}[x^*](t), t) \in \mathbb{R}^{n \times n}$ is positive semidefinite for almost every $t \in [a,b]$.

Remark 2.40. Theorem 2.39 can be found in [B11, Theorem 3.2]. In there, its proof is quite long, because we reconsider the sensitivity analysis of fractional control systems and the Ekeland variational principle. Actually Theorem 2.39 can be proved as a direct corollary of Theorem 2.30, where the adjoint vector $p$ plays the role of $\frac{(b-\cdot)^{\beta-1}}{\Gamma(\beta)} \nabla_2 L(x^*, {}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}[x^*], \cdot)$ and the Euler–Lagrange equation and the Legendre condition follow directly from the Hamiltonian maximization condition.



2.4.2 Obstructions for the usual argument in the fractional setting

In this section, extracted from [B11, Section 3.3], we discuss the obstructions when trying to extend the standard proof of the classical Legendre condition to the fractional setting. Roughly speaking, the standard approach in order to derive the Legendre condition in the classical case ($\alpha = 1$) is based on the existence of nontrivial variations $w \in {}_{\mathrm{c}}\mathrm{AC}^{1,\infty}_{a+}([a,b],\mathbb{R}^n)$ such that:

(i) $w$ and $\dot{w}$ are compactly supported in a small interval $[\tau, \tau+\varepsilon] \subset (a,b)$;

(ii) $\dot{w}$ “dominates” $w$ on $[\tau, \tau+\varepsilon]$ in a sense to be made precise (see the discussion in [64, p.60] for details).

In particular, since the variation $w$ is compactly supported, it holds that $w(a) = w(b) = 0_{\mathbb{R}^n}$ which does not perturb the terminal constraint $\psi(x(a), x(b)) \in \mathrm{S}$. Several different families of variations have been considered in the literature (see, e.g., [41, proof of Lemma p.103], [91, proof of Theorem 10.3.1] or [97, proof of Theorem 1.3]).

In a first attempt to derive a fractional version of the Legendre condition (with fixed initial and final points), the authors of [60] followed the same classical strategy as above. Unfortunately, one can easily check that the variation $w$ considered in [60, Equality (11)] does not satisfy all of the above properties. Precisely, while $w$ is indeed compactly supported, ${}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}[w]$ is clearly not (contrary to what is claimed in [60, Equality (12)]). Surprisingly the same mistake has been propagated in a series of papers (see [6, 7, 9]). This discovery was the starting point of the collaboration [B11] with Ferreira. Actually, due to the well-known memory effect of the fractional derivative ${}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}$, we conjectured that Property (i) could not be satisfied by any nontrivial variation $w$ in the purely fractional case $0 < \alpha < 1$. This is exactly the content of the following novel result, which rules out for good the exact adaptation of the classical approach for the Legendre condition to the purely fractional case with terminal constraints.

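Proposition 2.41. Let $0 < \alpha < 1$ and $x \in {}_{\mathrm{c}}\mathrm{AC}^{\alpha}_{a+}([a,b],\mathbb{R}^n)$. If there exist two real numbers $a \leq c < d \leq b$ such that $x(t) = {}_{\mathrm{c}}\mathrm{D}^{\alpha}_{a+}[x](t) = 0_{\mathbb{R}^n}$ for almost every $t \in [c,d]$, then $x(t) = 0_{\mathbb{R}^n}$ for all $t \in [a,d]$.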

Remark 2.42. Proposition 2.41 can be found in [B11, Proposition 3.3]. Its proof is based on the isolated zeros theorem in power series theory. I would like to emphasize that Proposition 2.41 is an intrinsic result of fractional calculus, in the sense that it is clearly not true for $\alpha = 1$.
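The obstruction to Property (i) can also be observed numerically. The Python sketch below is an illustration under stated assumptions (the bump function, the parameter values and the function names are hypothetical and are not taken from [B11] or [60]). It takes the scalar variation $w(s) = \sin^2(\pi (s-\tau)/\varepsilon)$ on $[\tau, \tau+\varepsilon]$, extended by zero, and evaluates its Caputo derivative at a time $t > \tau+\varepsilon$ outside the support. For such $t$ and for an absolutely continuous $w$, the Caputo derivative equals $\frac{1}{\Gamma(1-\alpha)} \int_{\tau}^{\tau+\varepsilon} (t-s)^{-\alpha} \dot{w}(s) \, ds$, and an integration by parts turns it into $-\frac{\alpha}{\Gamma(1-\alpha)} \int_{\tau}^{\tau+\varepsilon} (t-s)^{-\alpha-1} w(s) \, ds$, which is strictly negative for a nonnegative nontrivial $w$.

```python
import math

def midpoint(g, lo, hi, n=20000):
    """Composite midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def caputo_outside_support(t, alpha, tau, eps):
    """Caputo derivative at t > tau + eps of the bump w(s) = sin^2(pi*(s - tau)/eps)
    supported in [tau, tau + eps], computed from two equivalent formulas."""
    w = lambda s: math.sin(math.pi * (s - tau) / eps) ** 2
    dw = lambda s: (math.pi / eps) * math.sin(2.0 * math.pi * (s - tau) / eps)
    c = 1.0 / math.gamma(1.0 - alpha)
    # Definition: (1/Gamma(1-alpha)) * integral of (t-s)^(-alpha) * w'(s).
    direct = c * midpoint(lambda s: (t - s) ** (-alpha) * dw(s), tau, tau + eps)
    # After integration by parts (w vanishes at both ends of its support).
    by_parts = -alpha * c * midpoint(
        lambda s: (t - s) ** (-alpha - 1.0) * w(s), tau, tau + eps)
    return direct, by_parts

print(caputo_outside_support(t=0.8, alpha=0.5, tau=0.3, eps=0.2))
```

Both values agree and are negative, so the Caputo derivative of this compactly supported variation does not vanish after the support has ended, in line with the memory effect discussed above.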

Remark 2.43. Proposition 2.41 should be of independent interest for other researchers, in particular in the field of fractional differential equations. Indeed it is well known in classical differential equations that two different initial conditions yield two different solutions that cannot intersect each other. The preservation (or not) of this fundamental property at the fractional level was discussed in [27, 33]. In particular the authors of [27] prove that this property is preserved in the one-dimensional setting, while it is not in the higher-dimensional case (a counterexample is provided). Note that Proposition 2.41 allows us to contribute to this discussion. Precisely we can deduce that, even in the higher-dimensional case, two different initial conditions of a Caputo fractional differential equation of order $0 < \alpha < 1$ yield two different solutions that cannot coincide on an interval with a nonempty interior.

2.4.3 Bonus: discussion on some non-existence results

During my PhD thesis, and as developed in Chapter 1, I was particularly interested in the existence (or not) of minimizers to fractional functionals. Unexpectedly, it turns out that Theorem 2.39 allows us to provide some non-existence results for fractional calculus of variations problems. Precisely, consider the framework of Theorem 2.39 with $0 < \alpha < 1 \leq \beta$ (in particular for $\beta = 1$) and with no terminal constraint. In that context, the transversality condition implies that $\nabla_2 g(x^*(a), x^*(b)) = 0_{\mathbb{R}^n}$ from the right counterpart of Proposition 1.4. As a consequence, if one considers a Mayer function $g$ such that $\nabla_2 g(x_a, x_b) \neq 0_{\mathbb{R}^n}$ for all $(x_a, x_b) \in \mathbb{R}^n \times \mathbb{R}^n$, then we can directly conclude from Theorem 2.39 that the corresponding fractional Bolza functional $B$ has no minimizer in ${}_{\mathrm{c}}\mathrm{AC}^{\alpha,\infty}_{a+}([a,b],\mathbb{R}^n)$. This specific feature of the fractional setting is illustrated in Example 2.44 below.



Example 2.44. Consider the one-dimensional ($n = 1$) fractional Bolza functional given by

\[ B(x) := x(1) + \mathrm{I}^{1}_{0+}\left[ \frac{1}{2}\left( x^2 + {}_{\mathrm{c}}\mathrm{D}^{\alpha}_{0+}[x]^2 \right) \right](1), \]

for all $x \in {}_{\mathrm{c}}\mathrm{AC}^{\alpha,\infty}_{0+}([0,1],\mathbb{R})$, where $0 < \alpha \leq 1$ (and $\beta = 1$). In the classical case $\alpha = 1$, it can be proved that $B$ admits a minimizer in ${}_{\mathrm{c}}\mathrm{AC}^{1,\infty}_{0+}([0,1],\mathbb{R})$ given by $x(t) = \frac{2e}{1-e^2} \cosh(t)$ for all $t \in [0,1]$. Indeed one has to solve the Euler–Lagrange equation $\ddot{x} = x$ together with the transversality conditions $\dot{x}(0) = 0$ and $\dot{x}(1) = -1$ in order to determine the above candidate, and then prove that this candidate is optimal from the convexity of the Lagrange cost (following for example the strategy proposed in [91, p.258]). In contrast we can directly conclude from the above discussion that $B$ has no minimizer in the set ${}_{\mathrm{c}}\mathrm{AC}^{\alpha,\infty}_{0+}([0,1],\mathbb{R})$ in the purely fractional case $0 < \alpha < 1$.

It appears that a more natural fractional framework, which allows one to avoid the pitfall discussed in this section, is to take $\beta = \alpha$. Note that this consideration is not taken into account in most of the literature on fractional calculus of variations, including my first works [B03, B04, B05, B06, B07] presented in Chapter 1.


Part 1: References of Loïc Bourdin (chronological order)

[B01] L. Bourdin. Variational integrators of fractional Lagrangian systems in the framework of discrete embeddings. In proceedings of the Eleventh International Conference Zaragoza-Pau on Applied Mathematics and Statistics, volume 37 of Monogr. Mat. García Galdeano, pages 69–78. Prensas Univ. Zaragoza, Zaragoza, 2012. 3

[B02] L. Bourdin, J. Cresson, I. Greff, and P. Inizan. Variational integrator for fractional Euler–Lagrange equations. Appl. Numer. Math., 71:14–23, 2013. 3

[B03] L. Bourdin. Existence of a weak solution for fractional Euler–Lagrange equations. J. Math. Anal. Appl., 399(1):239–251, 2013. 1, 4, 12, 14, 32

[B04] L. Bourdin, T. Odzijewicz, and D. Torres. Existence of minimizers for fractional variational problems containing Caputo derivatives. Adv. Dyn. Syst. Appl., 8(1):3–12, 2013. 1, 4, 12, 14, 32

[B05] L. Bourdin. Contributions au calcul des variations et au principe du maximum de Pontryagin en calculs time scale et fractionnaire. PhD thesis, University of Pau (France), 2013. 1, 4, 8, 9, 10, 12, 14, 16, 17, 32

[B06] L. Bourdin, T. Odzijewicz, and D. Torres. Existence of minimizers for generalized Lagrangian functionals and a necessary optimality condition – Application to fractional variational problems. Differential Integral Equations, 27(7-8):743–766, 2014. 1, 4, 12, 14, 32

[B07] L. Bourdin and D. Idczak. A fractional fundamental lemma and a fractional integration by parts formula – Applications to critical points of Bolza functionals and to linear boundary value problems. Adv. Differential Equations, 20(3-4):213–232, 2015. 1, 4, 6, 12, 13, 14, 32

[B08] L. Bourdin. Cauchy–Lipschitz theory for fractional multi-order dynamics: state-transition matrices, Duhamel formulas and duality theorems. Differential Integral Equations, 31(7-8):559–594, 2018. 15, 18, 19, 20, 21, 22, 23

[B09] L. Bourdin. Weighted Hölder continuity of Riemann–Liouville fractional integrals – Application to regularity of solutions to fractional Cauchy problems with Carathéodory dynamics. Fract. Calc. Appl. Anal., 22(3):722–749, 2019. 15, 18, 19, 21

[B10] M. Bergounioux and L. Bourdin. Pontryagin maximum principle for general Caputo fractional optimal control problems with Bolza cost and terminal constraints. ESAIM Control Optim. Calc. Var. (to appear), 2019. 6, 15, 17, 18, 23, 25, 26, 27, 28, 29

[B11] L. Bourdin and R. Ferreira. First and second-order necessary optimality conditions for Bolza functionals with Caputo fractional derivatives and general mixed initial/final constraints. Submitted, 2019. 15, 18, 30, 31


Part 1: General bibliography (alphabetical order)

[1] A. Agrachev and Y. Sachkov. Control theory from the geometric viewpoint, volume 87 of Encyclopaedia of Mathematical Sciences. Springer-Verlag, Berlin, 2004. Control Theory and Optimization, II. 16

[2] O. Agrawal. Formulation of Euler–Lagrange equations for fractional variational problems. J. Math. Anal. Appl., 272(1):368–379, 2002. 3

[3] O. Agrawal. A general formulation and solution scheme for fractional optimal control problems. Nonlinear Dynam., 38(1-4):323–337, 2004. 16, 29

[4] O. Agrawal, D. Baleanu, and O. Defterli. A central difference numerical scheme for fractional optimal control problems. J. Vib. Control, 15(4):583–597, 2009. 16, 29

[5] O. Agrawal and P. Kumar. An approximate method for numerical solution of fractional differential equations. Signal Processing, 86(10):2602–2610, 2006. Special Section: Fractional Calculus Applications in Signals and Systems. 3

[6] R. Almeida. Variational problems involving a Caputo-type fractional derivative. J. Optim. Theory Appl., 174(1):276–294, 2017. 31

[7] R. Almeida. Optimality conditions for fractional variational problems with free terminal time. Discrete Contin. Dyn. Syst. Ser. S, 11(1):1–19, 2018. 31

[8] R. Almeida, A. Malinowska, and D. Torres. A fractional calculus of variations for multiple integrals with application to vibrating string. J. Math. Phys., 51(3):033503, 12, 2010. 2

[9] R. Almeida and M. Morgado. The Euler–Lagrange and Legendre equations for functionals involving distributed-order fractional derivatives. Appl. Math. Comput., 331:394–403, 2018. 31

[10] R. Almeida, S. Pooseh, and D. Torres. Fractional variational problems depending on indefinite integrals. Nonlinear Anal., 75(3):1009–1025, 2012. 3

[11] R. Almeida and D. Torres. Necessary and sufficient conditions for the fractional calculus of variations with Caputo derivatives. Commun. Nonlinear Sci. Numer. Simul., 16(3):1490–1500, 2011. 3

[12] R. Almeida and D. Torres. A discrete method to solve fractional optimal control problems. Nonlinear Dynam., 80(4):1811–1816, 2015. 29

[13] R. Bagley and P. Torvik. A theoretical basis for the application of fractional calculus in viscoelasticity. Journal of Rheology, 27:201–210, 1983. 2

[14] R. Bagley and P. Torvik. On the fractional calculus model of viscoelasticity behavior. Journal of Rheology, 30:133–155, 1986. 2


[15] F. Bahrami, H. Fazli, and A. Akbarfam. A new approach on fractional variational problems and Euler–Lagrange equations. Communications in Nonlinear Science and Numerical Simulation, 23, 06 2015. 4

[16] D. Baleanu and S. Muslih. Lagrangian formulation of classical fields within Riemann–Liouville fractional derivatives. Phys. Scripta, 72(2-3):119–121, 2005. 3

[17] D. Baleanu, I. Petras, J. Asad, and M. Velasco. Fractional Pais-Uhlenbeck oscillator. Int. J. Theor. Phys., 51, 2012. 3

[18] R. K. Biswas and S. Sen. Free final time fractional optimal control problems. J. Franklin Inst., 351(2):941–951, 2014. 29

[19] T. Blaszczyk and M. Ciesielski. Fractional Euler–Lagrange equations - Numerical solutions and applications of reflection operator. Scientific Research of the Institute of Mathematics and Computer Science, 2010. 3

[20] V. Boltyanski, R. Gamkrelidze, E. Mishchenko, and L. Pontryagin. The maximum principle in the theory of optimal processes of control. (With discussion). In Automatic and Remote Control (Proc. First Internat. Congr. Internat. Fed. Automat. Control, Moscow, 1960), Vol. I, pages 454–459. Butterworths, London, 1961. 16

[21] J. F. Bonnans. The shooting approach to optimal control problems. IFAC Proceedings Volumes, 46(11):281–292, 2013. 11th IFAC Workshop on Adaptation and Learning in Control and Signal Processing. 29

[22] A. Bressan and B. Piccoli. Introduction to the mathematical theory of control, volume 2 of AIMS Series on Applied Mathematics. American Institute of Mathematical Sciences (AIMS), Springfield, MO, 2007. 16, 26, 34, 80, 82, 83, 86

[23] J. Bryson and Y. Ho. Applied optimal control. Hemisphere Publishing Corp. Washington, D.C., 1975. Optimization, estimation, and control, Revised printing. 16

[24] F. Bullo and A. Lewis. Geometric control of mechanical systems, volume 49 of Texts in Applied Mathematics. Springer-Verlag, New York, 2005. Modeling, analysis, and design for simple mechanical control systems. 16

[25] L. Cesari. Optimization – Theory and applications, volume 17 of Applications of Mathematics (New York). Springer-Verlag, New York, 1983. Problems with ordinary differential equations. 26

[26] F. Comte. Opérateurs fractionnaires en économétrie et en finance. Prépublication MAP5, 2001. 2

[27] N. Cong and H. Tuan. Generation of nonlocal fractional dynamical systems by fractional differential equations. J. Integral Equations Appl., 29(4):585–608, 2017. 31

[28] J. Cresson. Fractional embedding of differential operators and Lagrangian systems. J. Math. Phys., 48(3):033504, 34, 2007. 3

[29] J. Cresson, I. Greff, and P. Inizan. Lagrangian for the convection-diffusion equation. Mathematical Methods in the Applied Sciences, 2011. 3, 14

[30] J. Cresson and P. Inizan. Variational formulations of differential equations and asymmetric fractional embedding. J. Math. Anal. Appl., 385(2):975–997, 2012. 3, 14

[31] B. Dacorogna. Direct methods in the calculus of variations, volume 78 of Applied Mathematical Sciences. Springer, New York, second edition, 2008. 4


[32] S. Das. State trajectory control and control energy for fractional order multivariate dynamic system. Tutorial for Department of Electrical Engineering V.N.I.T-Nagpur, 02 2013. 23

[33] K. Diethelm and N. Ford. Analysis of fractional differential equations. J. Math. Anal. Appl., 265(2):229–248, 2002. 20, 25, 26, 31

[34] K. Diethelm, N. Ford, and A. Freed. A predictor-corrector approach for the numerical solution of fractional differential equations. Nonlinear Dynam., 29(1-4):3–22, 2002. Fractional order calculus and its applications. 3

[35] K. Diethelm, N. Ford, A. Freed, and Y. Luchko. Algorithms for the fractional calculus: a selection of numerical methods. Comput. Methods Appl. Mech. Engrg., 194(6-8):743–773, 2005. 3

[36] I. Ekeland. On the variational principle. J. Math. Anal. Appl., 47:324–353, 1974. 16, 17, 27, 43, 48, 65

[37] H. O. Fattorini. Infinite-dimensional optimization and control theory, volume 62 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1999. 29, 45, 62

[38] A. F. Filippov. On some questions in the theory of optimal regulation: existence of a solution of the problem of optimal regulation in the class of bounded measurable functions. Vestnik Moskov. Univ. Ser. Mat. Meh. Astr. Fiz. Him., 1959(2):25–32, 1959. 16, 26, 88, 90

[39] G. Frederico and D. Torres. Fractional optimal control in the sense of Caputo and the fractional Noether’s theorem. Int. Math. Forum, 3(9-12):479–493, 2008. 16

[40] R. Gamkrelidze. Discovery of the maximum principle. In Mathematical events of the twentieth century, pages 85–99. Springer, Berlin, 2006. 16

[41] I. M. Gelfand and S. Fomin. Calculus of variations. Revised English edition translated and edited by Richard A. Silverman. Prentice-Hall, Inc., Englewood Cliffs, N.J., 1963. 31

[42] E. Gerolymatou, I. Vardoulakis, and R. Hilfer. Modelling infiltration by means of a nonlinear fractional diffusion model. J. Phys. D: Appl. Phys., 39:4104–4110, 2006. 2

[43] W. Glöckle and T. Nonnenmacher. A fractional calculus approach to self-similar protein dynamics. Biophysical Journal, 68:46–53, 1995. 2

[44] T. Guo. The necessary conditions of fractional optimal control in the sense of Caputo. J. Optim. Theory Appl., 156(1):115–126, 2013. 16

[45] E. Hairer, C. Lubich, and G. Wanner. Geometric numerical integration, volume 31 of Springer Series in Computational Mathematics. Springer-Verlag, Berlin, second edition, 2006. Structure-preserving algorithms for ordinary differential equations. 3

[46] T. Hélie and D. Matignon. Diffusive representations for the analysis and simulation of flared acoustic pipes with visco-thermal losses. Math. Models Methods Appl. Sci., 16(4):503–536, 2006. 2

[47] H. V. Helmholtz. Ueber die physikalische Bedeutung des Prinicips der kleinsten Wirkung. (Fortsetzung). J. Reine Angew. Math., 100:213–222, 1887. 2

[48] M. Hestenes. Calculus of variations and optimal control theory. Robert E. Krieger Publishing Co. Inc., Huntington, N.Y., 1980. Corrected reprint of the 1966 original. 16, 34

[49] R. Hilfer. Applications of fractional calculus in physics. World Scientific, River Edge, New Jersey, 2000. 2


[50] R. Hilfer. Fractional calculus and regular variation in thermodynamics. In Applications of fractional calculus in physics, pages 429–463. World Sci. Publ., River Edge, NJ, 2000. 2

[51] D. G. Hull. Optimal control theory for applications. Mechanical Engineering Series. Springer-Verlag, New York, 2003. 29

[52] D. Idczak and R. Kamocki. On the existence and uniqueness and formula for the solution of R-L fractional Cauchy problem in $\mathbb{R}^n$. Fract. Calc. Appl. Anal., 14(4):538–553, 2011. 21, 23

[53] Z. Jelicic and N. Petrovacki. Optimality conditions and a solution scheme for fractional optimal control problems. Struct. Multidiscip. Optim., 38(6):571–581, 2009. 16, 29

[54] F. Jiao and Y. Zhou. Existence of solutions for a class of fractional boundary value problems via critical point theory. Comput. Math. Appl., 62(3):1181–1199, 2011. 4

[55] F. Jiménez and S. Ober-Blöbaum. A fractional variational approach for modelling dissipative mechanical systems: continuous and discrete settings. IFAC-PapersOnLine, 51, 02 2018. 14

[56] V. Jurdjevic. Geometric control theory, volume 52 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1997. 16

[57] R. Kamocki. Pontryagin maximum principle for fractional ordinary optimal control problems. Math. Methods Appl. Sci., 37(11):1668–1686, 2014. 17

[58] A. Kilbas, H. Srivastava, and J. Trujillo. Theory and applications of fractional differential equations, volume 204 of North-Holland Mathematics Studies. Elsevier Science B.V., Amsterdam, 2006. 3, 4, 5, 6, 7

[59] M. Klimek. Existence - uniqueness result for a certain equation of motion in fractional mechanics. Bulletin of the Polish Academy of Sciences, 58(4):73–78, 2010. 4

[60] M. Lazo and D. Torres. The Legendre condition of the fractional calculus of variations. Optimization, 63(8):1157–1165, 2014. 31

[61] E. Lee and L. Markus. Foundations of optimal control theory. Robert E. Krieger Publishing Co. Inc., Melbourne, FL, second edition, 1986. 16, 34, 78, 80, 82, 83, 86

[62] P. Lévy. L’addition des variables aléatoires définies sur une circonférence. Bull. Soc. Math. France, 67:1–41, 1939. 2

[63] X. Li and J. Yong. Optimal control theory for infinite-dimensional systems. Systems & Control: Foundations & Applications. Birkhäuser Boston Inc., Boston, MA, 1995. 16, 42, 49, 50

[64] D. Liberzon. Calculus of variations and optimal control theory. Princeton University Press, Princeton, NJ, 2012. A concise introduction. 16, 26, 31

[65] H. Lin, H. Wu, and F. Mei. Variational integrators for fractional Birkhoffian systems. Nonlinear Dynamics, 87, 11 2016. 3

[66] R. Magin. Fractional calculus models of complex dynamics in biological tissues. Comput. Math. Appl., 59(5):1586–1593, 2010. 2

[67] M. Majewski. Existence of optimal solutions to Lagrange and Bolza problems for fractional Dirichlet problem via continuous dependence. 2014 19th International Conference on Methods and Models in Automation and Robotics, MMAR 2014, pages 152–158, 11 2014. 4

[68] A. Malinowska and D. Torres. Introduction to the fractional calculus of variations. Imperial College Press, London, 2012. 3


[69] J. Marsden and M. West. Discrete mechanics and variational integrators. Acta Numer., 10:357–514, 2001. 3

[70] I. Matychyn and V. Onyshchenko. Time-optimal control of fractional-order linear systems. Fract. Calc. Appl. Anal., 18(3):687–696, 2015. 29

[71] M. Meerschaert, H.-P. Scheffler, and C. Tadjeran. Finite difference methods for two-dimensional fractional dispersion equation. J. Comput. Phys., 211(1):249–261, 2006. 3

[72] M. Meerschaert and C. Tadjeran. Finite difference approximations for fractional advection-dispersion flow equations. J. Comput. Appl. Math., 172(1):65–77, 2004. 3

[73] R. Metzler and J. Klafter. The random walk’s guide to anomalous diffusion: a fractional dynamics approach. Phys. Rep., 339:1–77, 2000. 2

[74] K. Miller and B. Ross. An introduction to the fractional calculus and fractional differential equations. A Wiley-Interscience Publication. John Wiley & Sons Inc., New York, 1993. 3

[75] T. Odzijewicz, A. Malinowska, and D. Torres. Fractional calculus of variations in terms of a generalized fractional integral with applications to Physics. Abstr. Appl. Anal., 2012. 3

[76] T. Odzijewicz, A. Malinowska, and D. Torres. Fractional variational calculus of variable order. Advances in Harmonic Analysis and Operator Theory, 229:291–301, 2013. The Stefan Samko Anniversary Volume (Eds: A. Almeida, L. Castro, F.-O. Speck). 3

[77] K. Oldham and J. Spanier. The replacement of Fick’s laws by a formulation involving semidifferentiation. J. Electroanal. Chem., 26:331–341, 1970. 2

[78] T. Pfitzenreiter. A physical basis for fractional derivatives in constitutive equations. Z. Angew. Math. Mech., 84(4):284–287, 2004. 2

[79] I. Podlubny. Fractional differential equations, volume 198 of Mathematics in Science and Engineering. Academic Press Inc., San Diego, CA, 1999. An introduction to fractional derivatives, fractional differential equations, to methods of their solution and some of their applications. 3

[80] I. Podlubny. Matrix approach to discrete fractional calculus. Fract. Calc. Appl. Anal., 3(4):359–386, 2000. 3

[81] S. Pooseh, R. Almeida, and D. Torres. Fractional order optimal control problems with free terminal time. J. Ind. Manag. Optim., 10(2):363–381, 2014. 29

[82] F. Riewe. Nonconservative Lagrangian and Hamiltonian mechanics. Phys. Rev. E (3), 53(2):1890–1899, 1996. 2, 3

[83] F. Riewe. Mechanics with fractional derivatives. Phys. Rev. E (3), 55(3, part B):3581–3592, 1997. 2, 3

[84] J. Sabatier, O. Agrawal, and J. T. Machado. Advances in fractional calculus. Springer, Dordrecht, 2007. 2

[85] S. Samko, A. Kilbas, and O. Marichev. Fractional integrals and derivatives. Gordon and Breach Science Publishers, Yverdon, 1993. Theory and applications, Translated from the 1987 Russian original. 4, 5, 7

[86] H. Schättler and U. Ledzewicz. Geometric optimal control, volume 38 of Interdisciplinary Applied Mathematics. Springer, New York, 2012. Theory, methods and examples. 16


[87] S. Sethi and G. Thompson. Optimal control theory. Kluwer Academic Publishers, Boston,MA, second edition, 2000. Applications to management science and economics. 16, 34, 35,52

[88] Z. Tomovski, T. Sandev, R. Metzler, and J. Dubbeldam. Generalized space-time fractionaldiffusion equation with composite fractional time derivative. Phys. A, 391(8):2527–2542,2012. 3

[89] E. Trélat. Contrôle optimal. Mathématiques Concrètes. Vuibert, Paris, 2005. Théorie &applications. 16, 34, 55, 80, 82, 83, 86

[90] C. Tricaud and Y. Chen. Time-optimal control of systems with fractional dynamics. Int. J.Differ. Equ., pages Art. ID 461048, 16, 2010. 29

[91] B. van Brunt. The calculus of variations. Universitext. Springer-Verlag, New York, 2004. 31,32

[92] R. Vinter. Optimal control. Modern Birkhäuser Classics. Birkhäuser Boston, Inc., Boston,MA, 2010. Paperback reprint of the 2000 edition. 16, 34, 42, 44, 57

[93] D. Wang and A. Xiao. Fractional variational integrators for fractional variational problems.Communications in Nonlinear Science and Numerical Simulation, 17:602–610, 02 2012. 3

[94] H. Ye, J. Gao, and Y. Ding. A generalized Gronwall inequality and its application to a frac-tional differential equation. J. Math. Anal. Appl., 328(2):1075–1081, 2007. 20, 25, 26

[95] G. Zaslavsky. Chaos, fractional kinetics, and anomalous transport. Phys. Rep., 371(6):461–580, 2002. 2

[96] G. Zaslavsky. Hamiltonian chaos and fractional dynamics. Oxford University Press, Oxford,2008. Reprint of the 2005 original. 2

[97] M. Zelikin. Control theory and optimization. I, volume 86 of Encyclopaedia of Mathemati-cal Sciences. Springer-Verlag, Berlin, 2000. Homogeneous spaces and the Riccati equationin the calculus of variations, A translation of ıt Homogeneous spaces and the Riccati equa-tion in the calculus of variations (Russian), “Faktorial”, Moscow, 1998, Translation by S. A.Vakhrameev. 31

[98] A. Zoia, M.-C. Néel, and A. Cortis. Continuous-time random-walk model of transport invariably saturated heterogeneous porous media. Phys. Rev. E, 81(3), 2010. 2

[99] A. Zoia, M.-C. Néel, and M. Joelson. Mass transport subject to time-dependent flow withnonuniform sorption in porous media. Phys. Rev. E, 80, 2009. 2

Part II

Contributions to optimal sampled-data control theory on time scales

Chapter 3

Pontryagin maximum principle for state constrained optimal sampled-data control problems on time scales and bouncing trajectory phenomenon


3.1 Introduction

3.2 Basics on time scale theory

3.3 Main result and comments

3.3.1 A sample-and-hold procedure

3.3.2 A state constrained optimal sampled-data control problem on time scales

3.3.3 Pontryagin maximum principle and general comments

3.3.4 Preservation (or not) of some well known properties

3.3.5 An overview in several stages of the proof of Theorem 3.6

3.3.6 Perspectives

3.4 The observation of a bouncing trajectory phenomenon

3.4.1 Heuristical discussion: the expected behavior of an admissible trajectory

3.4.2 Mathematical justifications: a sufficient condition for bouncing trajectories

3.4.3 Numerical experiments based on an indirect method

3.5 Application to min-max optimal sampled-data control problems

3.6 Figures

When I was a PhD student at the University of Pau (developing some aspects of the calculus of variations and of optimal control theory with fractional calculus), I contacted Emmanuel Trélat from the Pierre et Marie Curie University (Paris 6, France) for his expertise in optimal control theory. After some discussions, he brought to my attention that the continuous- and discrete-time versions of the Pontryagin maximum principle slightly differ from each other. In 2012 we undertook together the research project of providing a unified version of the Pontryagin maximum principle for continuous- and discrete-time dynamics, by using the tools of time scale theory, whose aim is exactly to bridge the gap between continuous and discrete analyses. This project led us to the publications [B12, B13], and also allowed us to realize that time scale calculus is furthermore a suitable tool for handling optimal sampled-data control problems. Later, in 2016, we published the paper [B16] in which a Pontryagin maximum principle is established for optimal sampled-data control problems on time scales. Recently, with Piernicola Bettiol from the University of Brest (France) on the one hand [B22] and with my PhD student Gaurav Dhar from the University of Limoges on the other hand [B21], we decided to investigate state constrained optimal sampled-data control problems on time scales. The present chapter summarizes the contributions of the following four references:

• [B12]: L. Bourdin and E. Trélat. Pontryagin maximum principle for finite dimensional nonlinear optimal control problems on time scales. SIAM J. Control Optim., 51(5):3781–3813, 2013.

• [B16]: L. Bourdin and E. Trélat. Optimal sampled-data control, and generalizations on time scales. Math. Control Relat. Fields, 6(1):53–94, 2016.

• [B21]: L. Bourdin and G. Dhar. Optimal sampled-data controls with running inequality state constraints - Pontryagin maximum principle and bouncing trajectory phenomenon. Submitted, 2019.

• [B22]: P. Bettiol and L. Bourdin. Pontryagin maximum principle for state constrained optimal sampled-data control problems on time scales. Submitted, 2020.

3.1 Introduction

Continuous-time optimal (permanent) control theory. In this paragraph we will focus on a continuous-time control system described by a general differential equation of the form

\[ \dot{x}(t) = f(x(t),u(t),t), \quad \text{a.e. } t \in [0,T], \tag{3.1} \]

where $x$ stands for the state and $u$ is the control, and on the minimization problem of the Bolza cost given by

\[ g(x(0),x(T)) + \int_0^T L(x(\tau),u(\tau),\tau) \, d\tau. \]

Note that the control in the control system (3.1) is permanent, in the sense that it can be modified at any time $t \in [0,T]$.

Established in [182] by Pontryagin, Boltyanskii, Gamkrelidze and Mischenko at the end of the fifties, the Pontryagin Maximum Principle (in short, PMP) is a fundamental result which gives first-order necessary optimality conditions for the above optimal (permanent) control problem. Roughly speaking, the PMP ensures the existence of an adjoint vector $p$ (also called costate vector) satisfying some terminal conditions (called transversality conditions) such that the augmented state-costate vector satisfies a Hamiltonian system and such that the optimal control maximizes the Hamiltonian associated with the optimal (permanent) control problem. This wonderful result has numerous theoretical and numerical applications. I refer to [122, 156, 170, 190, 195, 197] and references therein. In particular, as a well known numerical application, if the Hamiltonian maximization condition allows one to express the optimal control as an explicit function of the augmented state-costate vector, then the PMP induces the so-called indirect numerical method, which consists in numerically solving the boundary value problem satisfied by the augmented state-costate vector via a shooting algorithm. Here the terminology indirect is opposed to that of direct numerical methods (see Section 3.4.3 for some details on these two different numerical approaches).
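To illustrate the indirect method described above, here is a minimal sketch on a toy problem that is my own choice and is not taken from the cited references: minimize $\frac{1}{2}\int_0^1 u(\tau)^2 \, d\tau$ subject to $\dot{x} = -x + u$, $x(0) = 0$, $x(1) = 1$. Assuming the normal case, the Hamiltonian $H = p(-x+u) - \frac{1}{2}u^2$ is maximized by $u = p$, and the adjoint equation reads $\dot{p} = -\partial H / \partial x = p$. The shooting unknown is the initial costate $p(0)$. The function names, the RK4 integrator and the secant solver are illustrative assumptions.

```python
import math

def rk4_flow(p0, n_steps=200):
    """Integrate the state-costate system on [0, 1] starting from (x, p) = (0, p0)."""
    def rhs(x, p):
        u = p              # maximizer of H(x, u, p) = p * (-x + u) - u**2 / 2
        return -x + u, p   # (dx/dt, dp/dt), with dp/dt = -dH/dx = p
    h = 1.0 / n_steps
    x, p = 0.0, p0
    for _ in range(n_steps):
        k1x, k1p = rhs(x, p)
        k2x, k2p = rhs(x + 0.5 * h * k1x, p + 0.5 * h * k1p)
        k3x, k3p = rhs(x + 0.5 * h * k2x, p + 0.5 * h * k2p)
        k4x, k4p = rhs(x + h * k3x, p + h * k3p)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return x, p

def shooting_function(p0):
    """Mismatch on the terminal condition x(1) = 1 for a guessed initial costate."""
    x1, _ = rk4_flow(p0)
    return x1 - 1.0

def secant(func, a, b, tol=1e-12, max_iter=50):
    """Plain secant iteration for a scalar equation func(p0) = 0."""
    fa, fb = func(a), func(b)
    for _ in range(max_iter):
        if abs(fb) < tol or fb == fa:
            break
        a, b, fa = b, b - fb * (b - a) / (fb - fa), fb
        fb = func(b)
    return b

p0_star = secant(shooting_function, 0.0, 1.0)
```

On this toy problem the extremal can be computed by hand, $x(t) = p(0)\sinh(t)$, so the value returned by the shooting procedure can be compared with $1/\sinh(1)$.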

Some remarkable extensions of the PMP. Soon after [182] and still nowadays, the PMP has been adapted to many situations, for control systems of different natures, with various constraints, etc. It is not my aim to give a state of the art here. Nevertheless, for the needs of this introduction, let me emphasize that:



• The classical PMP has no exact analogue at the discrete level. More precisely, several versions of the PMP were derived for discrete-time control systems of the form

\[ x_{k+1} - x_k = f(x_k,u_k,k), \quad \forall k = 0,\ldots,T-1, \tag{3.2} \]

where $T \in \mathbb{N}^*$, and where the cost is described by

\[ g(x_0,x_T) + \sum_{k=0}^{T-1} L(x_k,u_k,k). \]

I refer to [117, 154, 160, 161] for instance. In these discrete-time versions of the PMP, the Hamiltonian maximization condition does not hold in general (see counterexamples in [117, Examples 10.1-10.4]) and has to be replaced by a weaker condition known as the nonpositive Hamiltonian gradient condition (see, e.g., [117, Theorem 42.1]). In the control system (3.2) (like in (3.1)), note that the control is permanent, in the sense that it can be modified at any time $k = 0,\ldots,T-1$.

• An important extension of the PMP concerns state constraints, in which the state is restricted to a certain region of the state space. Indeed it is often undesirable and even inadmissible in scientific and engineering applications that the state crosses certain limits imposed in the state space for safety or practical reasons. Many examples can be found in mechanics and aerospace engineering (e.g., an engine may overheat or overload), management and economics (e.g., an inventory level may be limited in a production model), etc. I refer to [121, 128, 176, 190, 196] and references therein for some examples. Some versions of the PMP for state constrained continuous-time optimal (permanent) control problems can be found in [175, Section 3] by Maurer and in [147, Theorem 14.1] by Girsanov. A comprehensive survey [155] of this field of research was given in 1995 by Hartl, Sethi and Vickson. Note that the PMP for state constrained continuous-time optimal (permanent) control problems is more intricate because, in contrast to the state constraint-free case, the adjoint vector obtained is not absolutely continuous in general but (only) of bounded variation. Therefore theoretical and numerical difficulties may arise due to the possible pathological behavior of the adjoint vector, which consists in jumps and a singular part located on the portions of the trajectory in contact with the boundary of the restricted state space. As a consequence a wide portion of the literature is devoted to the analysis of the behavior of the adjoint vector, and some constraint qualification conditions have been established in order to ensure that the costate has no singular part (see, e.g., [110, 121, 135, 155, 163, 175]). I briefly conclude this item by mentioning that the related theme of Pontryagin maximum principles for state constrained discrete-time optimal (permanent) control problems has also been investigated in the literature (see, e.g., [183]).

Time scale calculus and contributions of the first paper [B12]. Time scale theory was introduced by Hilger in [157] in order to unify continuous and discrete analyses. By definition, a time scale $\mathbb{T}$ is an arbitrary nonempty closed subset of $\mathbb{R}$, and a dynamical system is said to be posed on the time scale $\mathbb{T}$ whenever the time variable evolves along this set $\mathbb{T}$. The continuous-time case corresponds to $\mathbb{T} = \mathbb{R}_+$ (for example) and the discrete-time case corresponds to $\mathbb{T} = \mathbb{N}$ (for example). But a time scale can be much more general (a Cantor set for example). Many notions of standard calculus (such as derivatives, integrals, etc.) have been extended to the time scale framework, and I refer the reader to [101, 102, 103, 115, 116] for details on that theory. I also refer to Section 3.2 of the present chapter for some basics. The Cauchy–Lipschitz (or Picard–Lindelöf) theory has been extended to differential equations posed on general time scales (see, e.g., [115, 157] and also [B13], whose contributions will not be presented in the present manuscript). For $\mathbb{T} = \mathbb{R}_+$ for example, one recovers the classical theory of continuous-time differential equations, while, for $\mathbb{T} = \mathbb{N}$ for example, one recovers the theory of difference equations. This illustrates that time scale theory allows one to close the gap between continuous and discrete analyses, and this is possible in any mathematical domain in which time scale calculus can be involved. Another example is provided in optimization with the calculus of variations on time scales, initiated in [113], and well studied in the existing literature (see, e.g., [142, 158, 174] and also [B14], whose contributions will not be presented in the present manuscript).

The authors of [159] established a version of the PMP for control systems defined on general time scales, but obtained (only) a nonpositive Hamiltonian gradient condition. In particular note that the discrete-time PMP was recovered, but the classical continuous-time PMP was not. A stronger version of the PMP (with the Hamiltonian maximization condition) was claimed in [202] but many arguments thereof were erroneous (see its MathSciNet report for details). My first collaboration [B12] with Trélat started in that context during my PhD thesis. Our objective was to prove a time scale version of the PMP which unifies in one result the continuous- and discrete-time versions of the PMP. Precisely we were able in [B12, Theorem 1] to obtain the Hamiltonian maximization condition at right-dense points of the time scale $\mathbb{T}$, and the nonpositive Hamiltonian gradient condition at right-scattered points of $\mathbb{T}$ (see Section 3.2 for the precise definitions of right-dense and right-scattered points). In particular our result encompassed the classical continuous-time PMP (taking $\mathbb{T} = \mathbb{R}_+$ for example), and also its discrete-time version (taking $\mathbb{T} = \mathbb{N}$ for example). Before concluding this paragraph, I emphasize that we encountered several technical difficulties due to the general time scale setting. Indeed several standard methods used in the literature to prove the classical PMP fail in the general time scale context. I refer to Section 3.3.5 for details on the obstructions and on the methods that we used in order to overcome them.

Sampled-data control systems and contributions of the second paper [B16]. The control in a dynamical system is very often assumed to be permanent in the literature (like in (3.1) and (3.2)). In the present paragraph let us discuss sampled-data control systems, in which the state evolves continuously with respect to time while the control evolves discretely with respect to time. Such a system has the form

\[ \dot{x}(t) = f(x(t),u_k,t), \quad \text{a.e. } t \in [k,k+1), \quad \forall k = 0,\ldots,T-1, \tag{3.3} \]

where $T \in \mathbb{N}^*$, and the cost has the form

\[ g(x(0),x(T)) + \sum_{k=0}^{T-1} \int_k^{k+1} L(x(\tau),u_k,\tau) \, d\tau. \]

In that context the control is usually said to be sampled-data in the literature. In particular the control is nonpermanent in the sense that its value over $[0,T]$ is authorized to be modified at (only) precise instants $k$, which are called sampling times, and remains frozen elsewhere. From another point of view, one can see the control system (3.3) as the same as (3.1) but with the control constrained to be piecewise constant. Sampled-data control systems have been widely considered as models in Engineering and Automation where, in numerous practical problems, the evolution of the state is very quick with respect to that of the control. Numerous texts and articles have developed sampled-data control theory (see, e.g., [100, 106, 126, 140, 168, 185]).
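As a small illustration of a system of the form (3.3), the following sketch simulates the scalar dynamics $f(x,u,t) = -x + u$ (a hypothetical choice of mine, not an example from the cited papers) under a control frozen on each sampling interval $[k,k+1)$. For this particular $f$ the flow over one sampling period is explicit, namely $x(k+1) = e^{-1}x(k) + (1-e^{-1})u_k$, so no numerical integrator is needed.

```python
import math

def simulate_sampled_data(x0, controls):
    """Return [x(0), x(1), ..., x(T)] for x'(t) = -x(t) + u_k on [k, k+1)."""
    decay = math.exp(-1.0)
    states = [x0]
    for u_k in controls:
        # exact flow over one sampling period with the control frozen at u_k
        states.append(decay * states[-1] + (1.0 - decay) * u_k)
    return states

# the control may only change at the sampling times k = 0, 1, 2
states = simulate_sampled_data(0.0, [1.0, 1.0, -0.5])
```

The list `controls` plays the role of $(u_k)_{k=0,\ldots,T-1}$: its length fixes the horizon $T$, and the state is only reported at the sampling times.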

Sampled-data control systems have the peculiarity of presenting a mixed continuous/discrete structure. The original idea developed in my second paper [B16] in collaboration with Trélat was to extend the PMP obtained in our earlier paper [B12] to the case where the state is defined on a general time scale $\mathbb{T}$, while the control is defined on another (possibly different) time scale $\mathbb{T}_1 \subset \mathbb{T}$ and is subject to a sampling procedure. In that context we established a PMP (see [B16, Theorem 2.6]) in which the Hamiltonian maximization condition is obtained at right-dense points of $\mathbb{T}_1$, and a nonpositive averaged Hamiltonian gradient condition is obtained at right-scattered points of $\mathbb{T}_1$. In particular this new result encompassed the previous one obtained in [B12, Theorem 1] by taking $\mathbb{T}_1 = \mathbb{T}$ (and thus also the classical continuous- and discrete-time versions of the PMP). Above all it allowed us to extend the PMP to various situations of optimal sampled-data control problems. For example this PMP handles control systems such as (3.3) by taking $\mathbb{T} = \mathbb{R}_+$ and $\mathbb{T}_1 = \mathbb{N}$, but also handles discrete-time optimal sampled-data control problems by taking $\mathbb{T} = \mathbb{N}$ and $\mathbb{T}_1 = 2\mathbb{N}$ for example. I conclude this paragraph by mentioning that the major difficulty encountered in this work was to derive the correct variation vector associated with a perturbation of the control at right-scattered points of $\mathbb{T}_1$ (see Section 3.3.5 for details) and to determine the suitable formulation of the necessary optimality condition as a nonpositive averaged Hamiltonian gradient condition which was, as far as we know, new in the literature at that time.

Contributions of the two works [B21, B22]. In the recent work [B22] written in collaboration with Bettiol, our objective was to extend the PMP obtained in [B16] to the state constrained case (precisely with inequality state constraints). This is exactly the content of the present chapter. A general state constrained optimal sampled-data control problem on time scales is introduced in Section 3.3.2 and the corresponding PMP, which was originally obtained in [B22, Theorem 1], is recalled in Theorem 3.6 in Section 3.3.3. A list of general comments is in order. In general (not only in the time scale or sampled-data control setting), the extension of the PMP from state constraint-free problems to state constrained problems is not trivial and requires several adjustments. I refer to Section 3.3.5 for details on the difficulties encountered and on the methods that we used in order to overcome them.

As in the classical case $\mathbb{T} = \mathbb{T}_1 = \mathbb{R}_+$, the adjoint vector obtained in Theorem 3.6 (in the general situation $\mathbb{T}_1 \subset \mathbb{T}$ of time scales) is not absolutely continuous in general but (only) of bounded variation and, therefore, some difficulties arise in theoretical and numerical applications due to the possible pathological behavior of the adjoint vector. In the recent work [B21] in collaboration with Dhar, we focused on state constrained continuous-time optimal sampled-data control problems (that is, taking $\mathbb{T} = \mathbb{R}_+$ and $\mathbb{T}_1 = \mathbb{N}$ for example). On this occasion we observed that the optimal trajectories returned by direct numerical approaches had a common behavior with respect to the state constraint. Precisely the optimal trajectories were "bouncing" on it. We refer to this new phenomenon (which does not appear in the classical case $\mathbb{T} = \mathbb{T}_1 = \mathbb{R}_+$) as the bouncing trajectory phenomenon. And indeed, we were able in [B21, Proposition 4.1] to prove, in the case of sampled-data controls and under (quite unrestrictive) hypotheses, that an optimal trajectory necessarily bounces against the boundary of the restricted state space and, moreover, that the rebounds can occur only at the sampling times (and thus in finite number and at precise instants). Inherent to this behavior, the singular part of the adjoint vector vanishes and its discontinuities are reduced to a finite number of jumps which can be located only at the sampling times. This feature presents several benefits from a numerical point of view in indirect methods (see Section 3.4 for details).

3.2 Basics on time scale theory

Let us start with some basic definitions and results employed in time scale theory, essentially extracted from the two monographs [115, 116] by Bohner and Peterson. A time scale $\mathbb{T}$ is an arbitrary nonempty closed subset of $\mathbb{R}$. In this chapter, without loss of generality, we will assume that $\mathbb{T}$ is bounded below, denoting $a := \min \mathbb{T}$, and unbounded above. In what follows, $\mathbb{T}$ will be the time scale on which the state of the control system evolves. The forward jump operator $\sigma : \mathbb{T} \to \mathbb{T}$ is defined by $\sigma(t) := \inf \{ \tau \in \mathbb{T} \mid \tau > t \}$ for every $t \in \mathbb{T}$. A point $t \in \mathbb{T}$ is said to be right-scattered whenever $\sigma(t) > t$. A point $t \in \mathbb{T}$ is said to be right-dense whenever $\sigma(t) = t$. We denote by $\mathrm{RS}$ the set of all right-scattered points of $\mathbb{T}$, and by $\mathrm{RD}$ the set of all right-dense points of $\mathbb{T}$. Recall that $\mathrm{RS}$ is at most countable (see [124, Lemma 3.1]) and that $\mathrm{RD}$ is the complement of $\mathrm{RS}$ in $\mathbb{T}$. The graininess function $\mu : \mathbb{T} \to \mathbb{R}_+$ is defined by $\mu(t) := \sigma(t) - t$ for every $t \in \mathbb{T}$. For every subset $A$ of $\mathbb{R}$, we write $A_{\mathbb{T}} := A \cap \mathbb{T}$.
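The definitions of the forward jump operator and of the graininess can be made concrete with the following sketch. As a simplifying assumption of mine, the time scale is represented as a sorted finite union of disjoint closed intervals (isolated points being degenerate intervals), hence bounded above, contrary to the standing assumption of this chapter; the forward jump is therefore only evaluated at points lying strictly below the maximum. The names `time_scale`, `forward_jump` and `graininess` are illustrative.

```python
# Hypothetical time scale T = [0, 1] U {2} U [3, 4], stored as sorted closed intervals.
time_scale = [(0.0, 1.0), (2.0, 2.0), (3.0, 4.0)]

def forward_jump(intervals, t):
    """sigma(t) = inf{tau in T | tau > t}, for t in T with t < max T."""
    for c, d in intervals:
        if c <= t < d:
            return t   # t has points of T immediately to its right: right-dense
        if t < c:
            return c   # the next component of T starts at c: right-scattered
    raise ValueError("sigma(t) is not evaluated at or beyond max T in this sketch")

def graininess(intervals, t):
    """mu(t) = sigma(t) - t."""
    return forward_jump(intervals, t) - t

def is_right_scattered(intervals, t):
    return forward_jump(intervals, t) > t
```

With this representation, the points $1$ and $2$ are right-scattered with graininess $1$, while every point of $[0,1)$ and of $[3,4)$ is right-dense.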

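$\Delta$-differentiability. Let $n \in \mathbb{N}^*$. A function $x : \mathbb{T} \to \mathbb{R}^n$ is said to be $\Delta$-differentiable at $t \in \mathbb{T}$ if the limit

\[ x^\Delta(t) := \lim_{\substack{\tau \to t \\ \tau \in \mathbb{T}}} \frac{x^\sigma(t) - x(\tau)}{\sigma(t) - \tau}, \]

where $x^\sigma := x \circ \sigma$, exists in $\mathbb{R}^n$. Recall that, if $s \in \mathrm{RD}$, then $x$ is $\Delta$-differentiable at $s$ if and only if the limit of $\frac{x(s) - x(\tau)}{s - \tau}$, as $\tau \to s$ with $\tau \in \mathbb{T}$, exists in $\mathbb{R}^n$. In that case it is equal to $x^\Delta(s)$. If $r \in \mathrm{RS}$ and $x$ is continuous at $r$, then $x$ is $\Delta$-differentiable at $r$ with $x^\Delta(r) = \frac{x^\sigma(r) - x(r)}{\mu(r)}$ (see, e.g., [115, Theorem 1.16]).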

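If two functions $x, x' : \mathbb{T} \to \mathbb{R}^n$ are both $\Delta$-differentiable at $t \in \mathbb{T}$, then the scalar product $\langle x, x' \rangle_{\mathbb{R}^n} : \mathbb{T} \to \mathbb{R}$ is $\Delta$-differentiable at $t$ with

\[ \langle x, x' \rangle_{\mathbb{R}^n}^\Delta(t) = \langle x^\Delta(t), x'^\sigma(t) \rangle_{\mathbb{R}^n} + \langle x(t), x'^\Delta(t) \rangle_{\mathbb{R}^n} = \langle x^\Delta(t), x'(t) \rangle_{\mathbb{R}^n} + \langle x^\sigma(t), x'^\Delta(t) \rangle_{\mathbb{R}^n}. \]

These equalities are usually called Leibniz formulas (see, e.g., [115, Theorem 1.20]) and I underline that a shift $\sigma$ is involved in the time scale setting.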
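The role of the shift $\sigma$ in the Leibniz formulas can be checked by hand at right-scattered points, where the $\Delta$-derivative reduces to a difference quotient over the graininess. The sketch below does so in exact rational arithmetic on a hypothetical time scale made of finitely many isolated points, with two arbitrary functions with values in $\mathbb{R}^2$; all names and data are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical time scale of isolated points (every point but the last is right-scattered).
points = [Fraction(t) for t in (0, 1, 3, 6, 10)]

def sigma(t):
    """Forward jump: the next point of the time scale."""
    return points[points.index(t) + 1]

def delta(x, t):
    """Delta-derivative at a right-scattered point: (x(sigma(t)) - x(t)) / mu(t)."""
    s = sigma(t)
    return tuple((a - b) / (s - t) for a, b in zip(x(s), x(t)))

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def x(t):
    return (t * t, 1 + t)

def y(t):
    return (2 * t, t * t * t)

def product(t):
    return (dot(x(t), y(t)),)

def leibniz_holds(t):
    """Check both Leibniz formulas for <x, y> at the right-scattered point t."""
    lhs = delta(product, t)[0]
    first = dot(delta(x, t), y(sigma(t))) + dot(x(t), delta(y, t))
    second = dot(delta(x, t), y(t)) + dot(x(sigma(t)), delta(y, t))
    return lhs == first == second
```

Dropping the shift, that is, replacing `y(sigma(t))` by `y(t)` in the first formula, breaks the equality as soon as $\mu(t) > 0$ and the two derivatives are not orthogonal, which is the discrete counterpart of the remark above.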

Lebesgue $\Delta$-measure and Lebesgue $\Delta$-integrability. Let $\mu_\Delta$ be the Lebesgue $\Delta$-measure on $\mathbb{T}$ defined in terms of Carathéodory extension (see [116, Chapter 5]). We also refer the reader to [103, 107, 124, 151] for more details. For all $(c,d) \in \mathbb{T}^2$ such that $c \le d$, one has $\mu_\Delta([c,d)_{\mathbb{T}}) = d - c$. Recall that $A \subset \mathbb{T}$ is a $\mu_\Delta$-measurable set of $\mathbb{T}$ if and only if $A$ is a $\mu_L$-measurable set of $\mathbb{R}$, where $\mu_L$ denotes the usual Lebesgue measure on $\mathbb{R}$ (see [124, Proposition 3.1]), and we have

\[ \mu_\Delta(A) = \mu_L(A) + \sum_{r \in A \cap \mathrm{RS}} \mu(r). \]

Let $A \subset \mathbb{T}$ be a $\mu_\Delta$-measurable subset of $\mathbb{T}$. A property is said to hold $\Delta$-almost everywhere (in short, $\Delta$-a.e.) on $A$ if it holds for every $t \in A \setminus A'$, where $A' \subset A$ is some $\mu_\Delta$-measurable set of $\mathbb{T}$ satisfying $\mu_\Delta(A') = 0$. In particular, since $\mu_\Delta(\{r\}) = \mu(r) > 0$ for every $r \in \mathrm{RS}$, we conclude that, if a property holds $\Delta$-a.e. on $A$, then it holds for every $r \in A \cap \mathrm{RS}$. Similarly, if $\mu_\Delta(A) = 0$, then $A \subset \mathrm{RD}$.

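The functional space $L^\infty_\Delta(A,\mathbb{R}^n)$ is the set of all functions $x$ defined $\Delta$-a.e. on $A$, with values in $\mathbb{R}^n$, that are $\mu_\Delta$-measurable on $A$ and bounded $\Delta$-almost everywhere. Endowed with the usual norm $\| x \|_{L^\infty_\Delta(A,\mathbb{R}^n)} := \operatorname{supess}_{\tau \in A} \| x(\tau) \|_{\mathbb{R}^n}$, it is a Banach space (see [103, Theorem 2.5]). The functional space $L^1_\Delta(A,\mathbb{R}^n)$ is the set of all functions $x$ defined $\Delta$-a.e. on $A$, with values in $\mathbb{R}^n$, that are $\mu_\Delta$-measurable on $A$ and such that $\int_A \| x(\tau) \|_{\mathbb{R}^n} \, \Delta\tau < +\infty$. Endowed with the usual norm $\| x \|_{L^1_\Delta(A,\mathbb{R}^n)} := \int_A \| x(\tau) \|_{\mathbb{R}^n} \, \Delta\tau$, it is a Banach space (see [103, Theorem 2.5]). We recall here that, if $x \in L^1_\Delta(A,\mathbb{R}^n)$, then

\[ \int_A x(\tau) \, \Delta\tau = \int_A x(\tau) \, d\tau + \sum_{r \in A \cap \mathrm{RS}} \mu(r) x(r), \]

(see [124, Theorems 5.1 and 5.2]). Note that, if $A$ is bounded, then $L^\infty_\Delta(A,\mathbb{R}^n) \subset L^1_\Delta(A,\mathbb{R}^n)$.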
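The decomposition of the $\Delta$-integral into a Lebesgue part and a weighted sum over right-scattered points lends itself to a direct computation. The following sketch evaluates it for a scalar function on $A = [\min \mathbb{T}, \max \mathbb{T})_{\mathbb{T}}$, where $\mathbb{T}$ is, as a simplifying assumption of mine, a sorted finite union of disjoint closed intervals. The Lebesgue part is approximated by a midpoint rule; the function name and the sample time scale are hypothetical.

```python
def delta_integral(x, intervals, n=1000):
    """Delta-integral of x over [min T, max T)_T, T being a sorted finite union
    of disjoint closed intervals (isolated points are degenerate intervals)."""
    total = 0.0
    for i, (c, d) in enumerate(intervals):
        # Lebesgue part on the component [c, d] (midpoint rule, zero if c == d)
        h = (d - c) / n
        total += sum(x(c + (k + 0.5) * h) for k in range(n)) * h
        # the right end d is right-scattered whenever another component follows,
        # and contributes mu(d) * x(d) with mu(d) = (start of next component) - d
        if i + 1 < len(intervals):
            total += (intervals[i + 1][0] - d) * x(d)
    return total

# Hypothetical time scale T = [0, 1] U {2} U [3, 4]
sample_time_scale = [(0.0, 1.0), (2.0, 2.0), (3.0, 4.0)]
measure = delta_integral(lambda t: 1.0, sample_time_scale)
first_moment = delta_integral(lambda t: t, sample_time_scale)
```

Taking $x \equiv 1$ gives back $\mu_\Delta([0,4)_{\mathbb{T}}) = 4 - 0$, in agreement with the identity $\mu_\Delta([c,d)_{\mathbb{T}}) = d - c$ recalled above, while for $x(\tau) = \tau$ the Lebesgue part equals $1/2 + 7/2$ and the right-scattered points $1$ and $2$ contribute $1 + 2$.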

Absolutely continuous functions and functions of bounded variation. Take $(c,d) \in \mathbb{T}^2$ such that $c < d$. Let $\mathrm{C}([c,d]_{\mathbb{T}},\mathbb{R}^n)$ denote the space of continuous functions defined on $[c,d]_{\mathbb{T}}$ with values in $\mathbb{R}^n$. Endowed with its usual uniform norm $\| \cdot \|_\infty$, it is a Banach space. Let $\mathrm{AC}([c,d]_{\mathbb{T}},\mathbb{R}^n)$ denote the subspace of absolutely continuous functions. Recall that, as in the classical continuous-time literature, absolutely continuous functions on time scales can be characterized by the fundamental theorem of calculus (integral representation). I refer to [123] (or [B12, p.3784]) for details.

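Let $\mathrm{BV}([c,d]_{\mathbb{T}},\mathbb{R}^n)$ stand for the space of functions of bounded variation defined on $[c,d]_{\mathbb{T}}$ taking values in $\mathbb{R}^n$, that is, the space of functions $x : [c,d]_{\mathbb{T}} \to \mathbb{R}^n$ such that

\[ \sup_{\{t_k\}_k} \sum_k \| x(t_{k+1}) - x(t_k) \|_{\mathbb{R}^n} < +\infty, \]

where the supremum is taken over all finite partitions $\{t_k\}_k$ of $[c,d]_{\mathbb{T}}$. As in the classical continuous-time literature, it can be proved that, if $d\eta$ is a finite nonnegative Borel measure on $[c,d]_{\mathbb{T}}$ and $x \in \mathrm{C}([c,d]_{\mathbb{T}},\mathbb{R}^n)$, then the function $y : [c,d]_{\mathbb{T}} \to \mathbb{R}^n$ defined by

\[ y(t) := \int_{[c,t]_{\mathbb{T}}} x(\tau) \, d\eta(\tau), \]

for all $t \in [c,d]_{\mathbb{T}}$, belongs to $\mathrm{BV}([c,d]_{\mathbb{T}},\mathbb{R}^n)$. I refer to [162] (or [B22, Section 2]) for more details.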



3.3 Main result and comments

This section is dedicated to the statement of the main result of the present chapter (see Theorem 3.6), which directly follows from the collaborations [B12, B16] with Trélat on the one hand, and from the work [B22] with Bettiol on the other hand. In Section 3.3.1 I first give some reminders on a sampling procedure on time scales originally introduced in [B16, p.60]. In Section 3.3.2 I introduce a general state constrained optimal sampled-data control problem on time scales, and I fix the terminology and assumptions used throughout the chapter. In Section 3.3.3 I finally state the corresponding Pontryagin maximum principle, and a list of comments is in order (notably Section 3.3.4 is dedicated to the preservation, or not, of some well known properties in the time scale and sampled-data control setting). An overview of the proof of Theorem 3.6 is given in Section 3.3.5. Finally the section is concluded with some perspectives in Section 3.3.6.

3.3.1 A sample-and-hold procedure

Let $\mathbb{T}_1$ be a second time scale, possibly different from the reference one $\mathbb{T}$ introduced in Section 3.2. Throughout the chapter, $\mathbb{T}_1$ will be the time scale on which the control of the control system evolves. Since the value of the control at times $t \in \mathbb{T}_1 \setminus \mathbb{T}$ would not influence the dynamics, we assume without loss of generality that $\mathbb{T}_1 \subset \mathbb{T}$. As for $\mathbb{T}$, we assume that $\min \mathbb{T}_1 = a$ and that $\mathbb{T}_1$ is unbounded above. In accordance with the previous section, we use the notation $\sigma_1$, $\mathrm{RS}_1$, $\mathrm{RD}_1$, $\Delta_1$, etc., for the analytical tools relative to the time scale $\mathbb{T}_1$. Since $\mathbb{T}_1 \subset \mathbb{T}$, we have $\mathrm{RD}_1 \subset \mathrm{RD}$ and $\mathrm{RS} \cap \mathbb{T}_1 \subset \mathrm{RS}_1$.

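A sample-and-hold procedure from $\mathbb{T}_1$ to $\mathbb{T}$ consists in defining an operator that extends to $\mathbb{T}$ any function defined on $\mathbb{T}_1$, by freezing the values on $\mathbb{T} \setminus \mathbb{T}_1$ in the sense given by Definition 3.1 below. In order to introduce this specific sampling procedure, we define the map

\[ \Phi : \mathbb{T} \longrightarrow \mathbb{T}_1, \qquad t \longmapsto \Phi(t) := \sup \{ \tau \in \mathbb{T}_1 \mid \tau \le t \}. \]

For every $t \in \mathbb{T}_1$, we have $\Phi(t) = t$. For every $t \in \mathbb{T} \setminus \mathbb{T}_1$, we have $\Phi(t) \in \mathrm{RS}_1$ and $\Phi(t) < t < \sigma_1(\Phi(t))$.

Definition 3.1 (Sample-and-hold procedure). Let $m \in \mathbb{N}^*$ and $u : \mathbb{T}_1 \to \mathbb{R}^m$ be a given function. In this chapter, the sampled-data function associated with $u$ is the function $u^\Phi : \mathbb{T} \to \mathbb{R}^m$ defined by the composition $u^\Phi := u \circ \Phi$.

Example 3.2. Let $m \in \mathbb{N}^*$ and consider $\mathbb{T} = \mathbb{R}_+$ and $\mathbb{T}_1 = \mathbb{N}$. If $u : \mathbb{N} \to \mathbb{R}^m$ is a given function, then the corresponding sampled-data function $u^\Phi : \mathbb{R}_+ \to \mathbb{R}^m$ is the piecewise constant function defined by $u^\Phi(t) := u(k)$ for all $t \in [k,k+1)$ and all $k \in \mathbb{N}$.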
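The sample-and-hold procedure can be sketched in a few lines when the control time scale consists of finitely many isolated sampling times (an assumption made here for illustration only). The map $t \mapsto \sup \{ \tau \in \mathbb{T}_1 \mid \tau \le t \}$ is then a sorted-list lookup, and the sampled-data function is its composition with the control. The names `hold_map` and `sample_and_hold`, and the sample data, are hypothetical.

```python
from bisect import bisect_right

def hold_map(sampling_times, t):
    """Largest sampling time tau <= t, for a sorted list of isolated sampling times."""
    i = bisect_right(sampling_times, t)
    if i == 0:
        raise ValueError("t lies before the first sampling time")
    return sampling_times[i - 1]

def sample_and_hold(u, sampling_times):
    """Extend a control u, given as a dict {sampling time: value}, by freezing
    its value between two consecutive sampling times."""
    return lambda t: u[hold_map(sampling_times, t)]

# Truncated version of Example 3.2: sampling times 0, 1, 2, 3
sampling_times = [0, 1, 2, 3]
u = {0: 5.0, 1: -1.0, 2: 0.5, 3: 2.0}
u_held = sample_and_hold(u, sampling_times)
```

For instance `u_held(0.99)` returns the value frozen at the sampling time $0$, and the value switches exactly at $t = 1$, which matches the half-open intervals $[k,k+1)$ of Example 3.2.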

3.3.2 A general state constrained optimal sampled-data control problem on time scales

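Let $\mathbb{T}_1 \subset \mathbb{T}$ be the two (possibly different) time scales introduced in Sections 3.2 and 3.3.1 (both unbounded above and both bounded below with $a = \min \mathbb{T} = \min \mathbb{T}_1$). Let $b \in \mathbb{T}$ be such that $a < b$ and let $m$, $n$, $j$ and $\ell \in \mathbb{N}^*$ be four fixed positive integers. In this chapter we focus on the general state constrained optimal sampled-data control problem on time scales given by

\[
\begin{array}{rl}
\text{minimize} & g(x(a),x(b)) + \displaystyle\int_{[a,b)_{\mathbb{T}}} L(x(\tau),u^\Phi(\tau),\tau) \, \Delta\tau, \\[8pt]
\text{subject to} & x \in \mathrm{AC}([a,b]_{\mathbb{T}},\mathbb{R}^n), \quad u \in L^\infty_{\Delta_1}([a,b)_{\mathbb{T}_1},\mathbb{R}^m), \\[4pt]
& x^\Delta(t) = f(x(t),u^\Phi(t),t), \quad \Delta\text{-a.e. } t \in [a,b)_{\mathbb{T}}, \\[4pt]
& \psi(x(a),x(b)) \in \mathrm{S}, \\[4pt]
& h_i(x(t),t) \le 0, \quad \forall t \in [a,b]_{\mathbb{T}}, \quad \forall i = 1,\ldots,j, \\[4pt]
& u(t) \in \mathrm{U}, \quad \Delta_1\text{-a.e. } t \in [a,b)_{\mathbb{T}_1},
\end{array}
\tag{P}
\]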



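where $g : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$, $L : \mathbb{R}^n \times \mathbb{R}^m \times [a,b]_{\mathbb{T}} \to \mathbb{R}$, $f : \mathbb{R}^n \times \mathbb{R}^m \times [a,b]_{\mathbb{T}} \to \mathbb{R}^n$, $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^\ell$, and $h = (h_i)_{i=1,\ldots,j} : \mathbb{R}^n \times [a,b]_{\mathbb{T}} \to \mathbb{R}^j$ are given functions, and where $\mathrm{U} \subset \mathbb{R}^m$ and $\mathrm{S} \subset \mathbb{R}^\ell$ are given sets.

A couple $(x,u) \in \mathrm{AC}([a,b]_{\mathbb{T}},\mathbb{R}^n) \times L^\infty_{\Delta_1}([a,b)_{\mathbb{T}_1},\mathbb{R}^m)$ is said to be admissible for Problem (P) if it satisfies all its constraints. Problem (P) is said to be feasible if it admits at least one admissible couple. A solution to Problem (P) is an admissible couple $(x^*,u^*)$ which minimizes the Bolza cost $g(x(a),x(b)) + \int_{[a,b)_{\mathbb{T}}} L(x(\tau),u^\Phi(\tau),\tau) \, \Delta\tau$ among all admissible couples. In Problem (P), $x$ is called the state function (also called the trajectory) and $u$ is called the control function.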

In the case where $\mathbb{T}_1 = \mathbb{T}$, the control is said to be permanent in Problem (P) because, in that situation, its value in the dynamical system can be modified at any time $t \in \mathbb{T}$. Otherwise, in the case where $\mathbb{T}_1 \subsetneq \mathbb{T}$, the control is said to be nonpermanent (or sampled-data in the sense of Definition 3.1) because its value in the dynamical system can be modified only at times $t \in \mathbb{T}_1$ and remains frozen elsewhere.

Throughout this chapter we fix the following terminology and regularity/topology hypotheses:

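• the dynamics function $f : \mathbb{R}^n \times \mathbb{R}^m \times [a,b]_{\mathbb{T}} \to \mathbb{R}^n$, which drives the state equation $x^\Delta(t) = f(x(t),u^\Phi(t),t)$, is continuous and of class $\mathrm{C}^1$ with respect to its first two variables;

• the Mayer cost function $g : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is of class $\mathrm{C}^1$;

• the Lagrange cost function $L : \mathbb{R}^n \times \mathbb{R}^m \times [a,b]_{\mathbb{T}} \to \mathbb{R}$ satisfies the same regularity assumptions as $f$;

• the control constraint set $\mathrm{U} \subset \mathbb{R}^m$ is a nonempty closed convex subset of $\mathbb{R}^m$;

• the function $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^\ell$, which describes the terminal state constraint $\psi(x(a),x(b)) \in \mathrm{S}$, is of class $\mathrm{C}^1$ and the set $\mathrm{S} \subset \mathbb{R}^\ell$ is a nonempty closed convex subset of $\mathbb{R}^\ell$;

• the function $h = (h_i)_{i=1,\ldots,j} : \mathbb{R}^n \times [a,b]_{\mathbb{T}} \to \mathbb{R}^j$, which describes the inequality state constraints $h_i(x(t),t) \le 0$, is continuous and of class $\mathrm{C}^1$ in its first variable.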

Remark 3.3. The general time scale framework considered in the formulation of Problem (P) allows one to recover several typical situations such as continuous-time or discrete-time optimal control problems, with permanent or sampled-data controls (see Remark 3.7 for details). The general terminal state constraint $\psi(x(a),x(b)) \in \mathrm{S}$ in Problem (P) allows one to recover various situations such as fixed/free terminal conditions, equality/inequality constraints, mixed initial/final conditions, etc. (see Remark 2.36 in the previous chapter for details). Finally, by considering $j = 1$ and $h \equiv -1$, note that the formulation of Problem (P) also allows one to recover the state constraint-free case.

Remark 3.4. In this chapter, for the sake of simplicity, I consider the final time $b$ to be fixed in Problem (P). Nevertheless we have considered in the works [B12, B16] the case where the final time $b$ is free (in order to deal with minimal time problems, for example). I refer to Remark 3.16 for some details on that context (in which the Mayer cost function $g$ can possibly depend also on $b$).

The main objective in the works [B12, B16, B22] was to establish first-order necessary optimality conditions for Problem (P) in a Pontryagin form. Regarding existence results, under some appropriate compactness and convexity assumptions (in the same spirit as Theorem 2.28), a Filippov-type existence result can be obtained. Precisely, let $\mathrm{E} \subset \mathrm{C}([a,b]_{\mathbb{T}},\mathbb{R}^n)$ stand for the set of all trajectories $x \in \mathrm{AC}([a,b]_{\mathbb{T}},\mathbb{R}^n)$ that can be associated with a control $u \in L^\infty_{\Delta_1}([a,b)_{\mathbb{T}_1},\mathbb{R}^m)$ such that the couple $(x,u)$ is admissible for Problem (P). Obviously, if $\mathrm{E}$ is empty, then Problem (P) has no solution. Otherwise, the following existence result holds true.

Theorem 3.5 (Filippov existence theorem). Assume that $\mathrm{E}$ is nonempty and is bounded in the space $\mathrm{C}([a,b]_{\mathbb{T}},\mathbb{R}^n)$, $\mathrm{U}$ is compact and $(f,L^{+})(x,\mathrm{U},t)$ is convex for all $(x,t) \in \mathbb{R}^n \times [a,b]_{\mathbb{T}}$ (see the notation introduced at the beginning of Section 2.3.3). Then Problem (P) has at least one optimal solution.

I refer to [B16, Theorem 2.1] in which the above result has been established (only in the state constraint-free case, but the techniques can be easily extended to the present context thanks to the continuity of $h$).

3.3.3 Pontryagin maximum principle and general comments

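Before providing a Pontryagin maximum principle, let us introduce the Hamiltonian $H : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R} \times [a,b]_{\mathbb{T}} \to \mathbb{R}$ associated to Problem (P) defined by

$$ H(x,u,q,\lambda,t) := \langle q, f(x,u,t) \rangle_{\mathbb{R}^n} - \lambda L(x,u,t), $$

for all $(x,u,q,\lambda,t) \in \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R} \times [a,b]_{\mathbb{T}}$. The main result of the present chapter is given in the next theorem (recall that the notions of normal cone and submersive map have both been introduced right before Theorem 2.30).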

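Theorem 3.6 (Pontryagin maximum principle). If $(x^*,u^*) \in \mathrm{AC}([a,b]_{\mathbb{T}},\mathbb{R}^n) \times L^\infty_{\Delta_1}([a,b)_{\mathbb{T}_1},\mathbb{R}^m)$ is a solution to Problem (P) and $\psi$ is submersive at $(x^*(a),x^*(b))$, then there exist $\lambda \geq 0$, $p \in \mathrm{AC}([a,b]_{\mathbb{T}},\mathbb{R}^n)$ and finite nonnegative Borel measures $d\eta_1,\ldots,d\eta_j$ on $[a,b]_{\mathbb{T}}$ such that the following conditions are satisfied:

(i) Nontriviality: $(p,\lambda,d\eta_1,\ldots,d\eta_j) \neq 0$;

(ii) Adjoint equation:

$$ -p^\Delta(t) = \nabla_x H(x^*(t),u^*(t),q(t),\lambda,t), \quad \Delta\text{-a.e. } t \in [a,b)_{\mathbb{T}}; $$

(iii) Transversality condition:

$$ \begin{pmatrix} p(a) \\ -q(b) \end{pmatrix} = \lambda \nabla g(x^*(a),x^*(b)) + \nabla \psi(x^*(a),x^*(b))^\top \times \xi, $$

where $\xi \in \mathrm{N}_{\mathrm{S}}[\psi(x^*(a),x^*(b))]$;

(iv) Hamiltonian conditions:

(iv-a) Hamiltonian maximization condition at right-dense points:

$$ u^*(s) \in \operatorname*{arg\,max}_{v \in \mathrm{U}} H(x^*(s),v,q(s),\lambda,s), \quad \Delta_1\text{-a.e. } s \in [a,b)_{\mathbb{T}_1} \cap \mathrm{RD}_1; $$

(iv-b) Nonpositive averaged Hamiltonian gradient condition at right-scattered points:

$$ \left\langle \int_{[r,\sigma_1(r))_{\mathbb{T}}} \nabla_u H(x^*(\tau),u^*(r),q(\tau),\lambda,\tau) \, \Delta\tau , \; v - u^*(r) \right\rangle_{\mathbb{R}^m} \leq 0, $$

for all $v \in \mathrm{U}$ and all $r \in [a,b)_{\mathbb{T}_1} \cap \mathrm{RS}_1$;

(v) Complementary slackness condition:

$$ \mathrm{supp}(d\eta_i) \subset \{ t \in [a,b]_{\mathbb{T}} \mid h_i(x^*(t),t) = 0 \}, \quad \forall i = 1,\ldots,j, $$

where $\mathrm{supp}(d\eta_i)$ stands for the classical notion of support of the measure $d\eta_i$.

Here $q \in \mathrm{BV}([a,b]_{\mathbb{T}},\mathbb{R}^n)$ is defined by

$$ q(t) := \begin{cases} p^\sigma(t) + \displaystyle\sum_{i=1}^{j} \int_{[a,t]_{\mathbb{T}}} \nabla_x h_i(x^*(\tau),\tau) \, d\eta_i(\tau) & \text{if } t \in [a,b)_{\mathbb{T}}, \\[2mm] p(b) + \displaystyle\sum_{i=1}^{j} \int_{[a,b]_{\mathbb{T}}} \nabla_x h_i(x^*(\tau),\tau) \, d\eta_i(\tau) & \text{if } t = b, \end{cases} $$

for all $t \in [a,b]_{\mathbb{T}}$.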

Theorem 3.6 has been gradually established in the works [B12, B16, B22]. The proof is commented on in detail in Section 3.3.5. A list of general comments is now in order.

Remark 3.7. The general time scale framework considered in the formulation of Problem (P) and the two different Hamiltonian conditions obtained in Theorem 3.6 (according to right-dense or right-scattered points) make it possible to recover several typical situations, among which:

• Continuous-time optimal (permanent) control problems. For example, taking $\mathbb{T} = \mathbb{T}_1 = \mathbb{R}_+$ with $(a,b) = (0,T)$ for some $T > 0$, we recover the usual situation where the state equation in Problem (P) is described by a standard controlled differential equation $\dot{x}(t) = f(x(t),u(t),t)$ over $[0,T]$ and where the Bolza cost is given by $g(x(0),x(T)) + \int_0^T L(x(\tau),u(\tau),\tau) \, d\tau$. The Hamiltonian condition in Theorem 3.6 corresponds to the standard Hamiltonian maximization condition given by

$$ u^*(s) \in \operatorname*{arg\,max}_{v \in \mathrm{U}} H(x^*(s),v,q(s),\lambda,s), $$

for almost every $s \in [0,T]$. Therefore Theorem 3.6 encompasses the historical Pontryagin maximum principle obtained in [182] (in the state constraint-free case) and in [155, 172, 197] (in the state constrained case).

• Discrete-time optimal (permanent) control problems. For example, taking $\mathbb{T} = \mathbb{T}_1 = \mathbb{N}$ with $(a,b) = (0,T)$ for some $T \in \mathbb{N}^*$, we recover the usual situation where the state equation in Problem (P) is described by a standard controlled difference equation $x_{k+1} - x_k = f(x_k,u_k,k)$ for all $k = 0,\ldots,T-1$ and where the Bolza cost is given by $g(x(0),x(T)) + \sum_{k=0}^{T-1} L(x_k,u_k,k)$. The Hamiltonian condition in Theorem 3.6 corresponds to the nonpositive Hamiltonian gradient condition given by

$$ \left\langle \nabla_u H(x^*_k,u^*_k,q_k,\lambda,k), \; v - u^*_k \right\rangle_{\mathbb{R}^m} \leq 0, $$

for all $v \in \mathrm{U}$ and all $k = 0,\ldots,T-1$. Therefore Theorem 3.6 encompasses the discrete-time version of the Pontryagin maximum principle obtained in [117] (in the state constraint-free case, in which $q_k = p_{k+1}$ for all $k = 0,\ldots,T-1$) and in [183] (in the state constrained case).

• Continuous-time optimal sampled-data control problems. For example, taking $\mathbb{T} = \mathbb{R}_+$ and $\mathbb{T}_1 = \mathbb{N}$ with $(a,b) = (0,T)$ for some $T \in \mathbb{N}^*$, we obtain from Example 3.2 that the state equation in Problem (P) is described by the differential equation with sampled-data control given by $\dot{x}(t) = f(x(t),u_k,t)$ over all intervals $[k,k+1)$, and where the Bolza cost is given by $g(x(0),x(T)) + \sum_{k=0}^{T-1} \int_k^{k+1} L(x(\tau),u_k,\tau) \, d\tau$. The Hamiltonian condition in Theorem 3.6 gives the following nonpositive averaged Hamiltonian gradient condition

$$ \left\langle \int_k^{k+1} \nabla_u H(x^*(\tau),u^*_k,q(\tau),\lambda,\tau) \, d\tau , \; v - u^*_k \right\rangle_{\mathbb{R}^m} \leq 0, $$

for all $v \in \mathrm{U}$ and all $k = 0,\ldots,T-1$. Note that Trélat and I dedicated the proceedings paper [B15] to this result in the state constraint-free case and that it was extended to the state constrained case in [B21], written in collaboration with Dhar. The above nonpositive averaged Hamiltonian gradient condition in the context of the present item was, as far as I know, new in the literature at that time.

• Continuous-time optimal parameter problems. For example, taking $\mathbb{T} = \mathbb{R}_+$ with $(a,b) = (0,T)$ for some $T > 0$, and $\mathbb{T}_1 = \{0\} \cup [T,+\infty)$, we obtain that the state equation in Problem (P) is described by the parameterized differential equation given by $\dot{x}(t) = f(x(t),u,t)$ over $[0,T]$, where $u \in \mathrm{U} \subset \mathbb{R}^m$ plays the role of a parameter, and where the Bolza cost is given by $g(x(0),x(T)) + \int_0^T L(x(\tau),u,\tau) \, d\tau$. The Hamiltonian condition in Theorem 3.6 gives the following nonpositive averaged Hamiltonian gradient condition

$$ \left\langle \int_0^T \nabla_u H(x^*(\tau),u^*,q(\tau),\lambda,\tau) \, d\tau , \; v - u^* \right\rangle_{\mathbb{R}^m} \leq 0, $$

for all $v \in \mathrm{U}$.

Hence the general time scale framework makes it possible to deal simultaneously with the four typical situations above, and also extends to more general situations. For example, one can deal with discrete-time optimal sampled-data control problems by taking $\mathbb{T} = \mathbb{N}$ and $\mathbb{T}_1 = 2\mathbb{N}$ with $(a,b) = (0,T)$ for some $T \in 2\mathbb{N}^*$. One can also investigate continuous-time optimal permanent control problems with some noncontrol intervals by taking, for example, $\mathbb{T} = \mathbb{R}_+$ and $\mathbb{T}_1 = \cup_{k \in \mathbb{N}} [2k,2k+1]$ with $(a,b) = (0,T)$ for some $T \in 2\mathbb{N}^*$. The details of the formulation of Problem (P) and of the conclusions of Theorem 3.6 in these last two contexts are left to the reader.
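The continuous-time sampled-data item of Remark 3.7 can be illustrated numerically. The following Python sketch is only a toy example built for this purpose (it is not taken from the works cited above, and all function names are hypothetical). It assumes scalar dynamics $\dot{x}(t) = u_k$ on $[k,k+1)$, a fixed initial state, a free final state, no Mayer cost, the Lagrange cost $L = (x^2+u^2)/2$, no control constraint ($\mathrm{U} = \mathbb{R}$), no state constraint and the normal case $\lambda = 1$. Under these assumptions the adjoint equation reads $\dot{p} = x$ with $p(T) = 0$, and the averaged condition reduces to $\int_k^{k+1} (p(\tau) - u_k) \, d\tau = 0$. The sketch computes the control satisfying this condition and the accompanying checks compare it with a direct evaluation of the cost.

```python
import numpy as np


def simulate(u, x0):
    """States x_k at the sampling times for dx/dt = u_k on [k, k+1)."""
    x = np.empty(len(u) + 1)
    x[0] = x0
    for k, uk in enumerate(u):
        x[k + 1] = x[k] + uk
    return x


def cost(u, x0):
    """Exact value of the integral of (x^2 + u^2)/2 over [0, N] (toy Lagrange cost)."""
    xk = simulate(u, x0)[:-1]
    # On [k, k+1) the state is x_k + u_k * s, so the integral of x^2 is x_k^2 + x_k u_k + u_k^2 / 3.
    return float(np.sum(0.5 * (xk**2 + xk * u + u**2 / 3.0) + 0.5 * u**2))


def averaged_hamiltonian_gradient(u, x0):
    """Integral over [k, k+1) of grad_u H = p(tau) - u_k, assuming lambda = 1."""
    x = simulate(u, x0)
    n = len(u)
    p_next = 0.0  # transversality for this toy problem: p(N) = 0
    avg = np.empty(n)
    for k in range(n - 1, -1, -1):
        # Adjoint equation dp/dt = x, integrated backward in closed form on [k, k+1).
        avg[k] = p_next - (x[k] / 2.0 + u[k] / 3.0) - u[k]
        p_next = p_next - (x[k] + u[k] / 2.0)
    return avg


def solve_unconstrained(n, x0):
    """Control for which the averaged Hamiltonian gradient vanishes on every sampling interval.

    The averaged gradient is affine in u for this linear-quadratic toy problem,
    so its matrix is recovered column by column and a linear system is solved.
    """
    g0 = averaged_hamiltonian_gradient(np.zeros(n), x0)
    A = np.column_stack([averaged_hamiltonian_gradient(e, x0) - g0 for e in np.eye(n)])
    return np.linalg.solve(A, -g0)
```

Calling `solve_unconstrained(4, 1.0)` returns the four sampled control values; since the toy cost is strictly convex, the control for which the averaged gradient vanishes is also the minimizer of `cost`, which can be checked by finite differences.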

Remark 3.8. As is well known in optimal control theory, the nontrivial tuple $(p,\lambda,d\eta_1,\ldots,d\eta_j)$ obtained in Theorem 3.6, which is a Lagrange multiplier, is defined up to a positive multiplicative scalar. It is said to be normal whenever $\lambda > 0$, and abnormal whenever $\lambda = 0$. In the normal case $\lambda > 0$, it is usual to normalize the Lagrange multiplier so that $\lambda = 1$.

Remark 3.9. Our strategy of proof of Theorem 3.6 in the works [B12, B16, B22] was based on the Ekeland variational principle [138, Theorem 1.1] which is, contrary to some other usual methods used in classical optimal control theory, suitable for dealing with the general time scale setting. I refer to Section 3.3.5 for a detailed discussion on that technical point. This approach requires the closedness of $\mathrm{U}$ in order to define the corresponding penalized functional on a complete metric space. Thus the closedness of $\mathrm{U}$ is a crucial assumption in the above works. However, note that it is possible to slightly extend Theorem 3.6 to the case where $\mathrm{U}$ is not convex, by using the concept of stable $\mathrm{U}$-dense directions introduced in [B12, Section 2.2] (which will not be discussed in the present manuscript).

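Remark 3.10. The nonpositive averaged Hamiltonian gradient condition in Theorem 3.6 can be written as

$$ \int_{[r,\sigma_1(r))_{\mathbb{T}}} \nabla_u H(x^*(\tau),u^*(r),q(\tau),\lambda,\tau) \, \Delta\tau \in \mathrm{N}_{\mathrm{U}}[u^*(r)], $$

for all $r \in [a,b)_{\mathbb{T}_1} \cap \mathrm{RS}_1$, which is equivalent to the fixed-point formulation

$$ u^*(r) = \mathrm{proj}_{\mathrm{U}} \left( u^*(r) + \int_{[r,\sigma_1(r))_{\mathbb{T}}} \nabla_u H(x^*(\tau),u^*(r),q(\tau),\lambda,\tau) \, \Delta\tau \right), $$

for all $r \in [a,b)_{\mathbb{T}_1} \cap \mathrm{RS}_1$, where $\mathrm{proj}_{\mathrm{U}} : \mathbb{R}^m \to \mathrm{U}$ stands for the classical projection operator onto $\mathrm{U}$. In particular, when $\mathrm{U} = \mathbb{R}^m$ (no control constraint), note that the Hamiltonian conditions in Theorem 3.6 imply that

$$ \nabla_u H(x^*(s),u^*(s),q(s),\lambda,s) = 0_{\mathbb{R}^m}, $$

for $\Delta_1$-a.e. $s \in [a,b)_{\mathbb{T}_1} \cap \mathrm{RD}_1$, and

$$ \int_{[r,\sigma_1(r))_{\mathbb{T}}} \nabla_u H(x^*(\tau),u^*(r),q(\tau),\lambda,\tau) \, \Delta\tau = 0_{\mathbb{R}^m}, $$

for all $r \in [a,b)_{\mathbb{T}_1} \cap \mathrm{RS}_1$.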
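The equivalence between the normal cone inclusion and the fixed-point formulation of Remark 3.10 can be checked on a simple case. The Python sketch below assumes, purely for illustration, that the control constraint set is a box $[\mathrm{lo},\mathrm{hi}]^m$, for which the projection is a componentwise clipping; the vector `g` stands for the averaged Hamiltonian gradient, and the function names are hypothetical.

```python
import numpy as np


def proj_box(y, lo, hi):
    """Projection onto the box [lo, hi]^m (componentwise clipping)."""
    return np.clip(y, lo, hi)


def in_normal_cone_box(g, u, lo, hi, tol=1e-12):
    """True if <g, v - u> <= 0 for all v in the box, for a point u of the box."""
    # The supremum of <g, v - u> over the box is attained coordinate by coordinate.
    v_best = np.where(g > 0, hi, lo)
    return bool(np.dot(g, v_best - u) <= tol)


def is_fixed_point(g, u, lo, hi, tol=1e-12):
    """True if u = proj_U(u + g) up to the tolerance tol."""
    return bool(np.max(np.abs(proj_box(u + g, lo, hi) - u)) <= tol)
```

For instance, with `u = np.array([0.0, 0.5, 1.0])` in the box $[0,1]^3$, the vector `g = np.array([-2.0, 0.0, 3.0])` satisfies both tests, while `g = np.array([1.0, 0.0, 0.0])` satisfies neither.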

Remark 3.11. For the description of some typical situations of terminal state constraint in Problem (P) and, in a similar spirit, of the corresponding transversality conditions derived in Theorem 3.6, I refer to Remark 2.36 in the previous chapter.

Remark 3.12. Note that the submersiveness assumption in Theorem 3.6 is not restrictive. Indeed, if $(x^*,u^*)$ is a solution to Problem (P) which does not satisfy the submersion property, then one can easily go back to the submersive case by noting that $(x^*,u^*)$ is also a solution to the same problem as Problem (P) but replacing $\ell$ by $\tilde{\ell} := 2n$, $\psi$ by the identity function $\tilde{\psi}$ and $\mathrm{S}$ by the singleton $\tilde{\mathrm{S}} := \{ (x^*(a),x^*(b)) \}$. With this new problem, the submersion property is obviously satisfied and Theorem 3.6 can be applied. However, with this new problem, the normal cone to $\tilde{\mathrm{S}}$ is the entire space, and thus the transversality condition does not provide any information. In other words, if the submersion property is not satisfied, then Theorem 3.6 is still valid by removing (only) the item (iii).

Remark 3.13. As in state constrained continuous-time optimal permanent control problems (see, e.g., [155, 197]), the vector $p$ (resp. the vector $q$) provided in Theorem 3.6 is called AC-adjoint vector (resp. BV-adjoint vector). Note that the terminology costate vector is also frequently used in the literature. Up to the presence of the shift $\sigma$, the AC-adjoint vector $p$ corresponds to an absolutely continuous part of the BV-adjoint vector $q$, and the difference between them can be expressed in terms of $\sum_{i=1}^{j} \int_{[a,t]_{\mathbb{T}}} \nabla_x h_i(x^*(\tau),\tau) \, d\eta_i(\tau)$. From the complementary slackness condition, we deduce that this difference (possibly containing discontinuity jumps and singular parts) can change only at times when the inequality state constraints $h_i(x^*(t),t) \leq 0$ are active, that is, when $h_i(x^*(t),t) = 0$ for some $i = 1,\ldots,j$. This behavior is well illustrated in Sections 3.4 and 3.5 in which Theorem 3.6 is applied in order to solve numerically some state constrained continuous-time optimal sampled-data control problems.

Remark 3.14. Note that the necessary optimality conditions of Theorem 3.6 are of interest only when the inequality state constraints are nondegenerate, in the sense that $\nabla_x h_i(x^*(t),t) \neq 0_{\mathbb{R}^n}$ whenever $h_i(x^*(t),t) = 0$ for some $i = 1,\ldots,j$. I refer to [197, Remark (b) p.330] for a similar comment in the case of (nonsmooth) state constrained continuous-time optimal permanent control problems.

3.3.4 Preservation (or not) of some well known properties

In this section I recall some basic properties that occur in continuous-time optimal permanent control theory (taking $\mathbb{T} = \mathbb{T}_1 = \mathbb{R}_+$ for example). My aim is to discuss their extension (or not) to the general optimal sampled-data control theory on time scales. This discussion is extracted from the works [B12, B15, B16] in which it is accompanied by examples and counterexamples. Unfortunately, for the sake of brevity, these illustrations are not recalled in the present chapter, but I will refer to them.

Remark 3.15. As is known in discrete-time optimal permanent control theory, and a fortiori in optimal sampled-data control theory on time scales, the classical Hamiltonian maximization condition does not hold true in general at right-scattered points of $\mathbb{T}_1$, where it is replaced by the nonpositive averaged Hamiltonian gradient condition. I refer to [117, Examples 10.1-10.4] (recalled in [B12, Example 7]) for a counterexample in a discrete-time optimal permanent control problem (roughly speaking with $\mathbb{T} = \mathbb{T}_1 = \mathbb{N}$) and to [B16, Remark 18] for a counterexample in a continuous-time optimal sampled-data control problem (roughly speaking with $\mathbb{T} = \mathbb{R}_+$ and $\mathbb{T}_1 = \mathbb{N}$). I emphasize that, in the context of Theorem 3.6 and under some additional convexity assumptions on the dynamics (such as the one introduced by Holtzman and Halkin in [161]), to the best of my knowledge, it should be possible to obtain the averaged Hamiltonian maximization condition given by

$$ u^*(r) \in \operatorname*{arg\,max}_{v \in \mathrm{U}} \int_{[r,\sigma_1(r))_{\mathbb{T}}} H(x^*(\tau),v,q(\tau),\lambda,\tau) \, \Delta\tau, $$

for all $r \in [a,b)_{\mathbb{T}_1} \cap \mathrm{RS}_1$. Taking $\mathbb{T} = \mathbb{T}_1 = \mathbb{N}$ for example, one would recover the main result of the work [161] in which the authors recovered the Hamiltonian maximization condition for discrete-time optimal permanent control problems satisfying a directional convexity assumption.

Remark 3.16. In this remark, for simplicity, we focus on the state constraint-free case (taking $j = 1$ and $h \equiv -1$). Consider the framework of Theorem 3.6 and let us introduce the maximized Hamiltonian function $\mathcal{H} : [a,b]_{\mathbb{T}} \to \mathbb{R}$ defined by

$$ \mathcal{H}(t) := H(x^*(t),u^*(t),q(t),\lambda,t) = H(x^*(t),u^*(t),p^\sigma(t),\lambda,t), $$

for $\Delta$-a.e. $t \in [a,b)_{\mathbb{T}}$. In a continuous-time optimal permanent control problem (taking $\mathbb{T} = \mathbb{T}_1 = \mathbb{R}_+$ with $(a,b) = (0,T)$ for some $T > 0$), it is well known that the two following properties are satisfied:

(i) $\mathcal{H}$ can be identified on $[0,T]$ with an absolutely continuous function which satisfies

$$ \dot{\mathcal{H}}(t) = \nabla_t H(x^*(t),u^*(t),p(t),\lambda,t), $$

for a.e. $t \in [0,T]$ (see, e.g., [141, Theorem 2.6.3]). In particular, if $H$ is autonomous (that is, does not depend on $t$), then $\mathcal{H}$ is constant.

(ii) Furthermore, if the final time $T$ is free in Problem (P) (which is not discussed in the present chapter) and $(x^*,u^*,T^*)$ is an optimal triplet with $T^* > 0$, then the Lagrange multiplier provided in Theorem 3.6 can be selected such that $\mathcal{H}(T^*) = 0$.

Now let us discuss the preservation (or not) of the two above properties in the general time scale and sampled-data control setting. Property (i) is not true in optimal sampled-data control theory on time scales in general. I refer to [B12, Example 8] for a counterexample in a discrete-time optimal permanent control problem (roughly speaking with $\mathbb{T} = \mathbb{T}_1 = \mathbb{N}$) and to [B16, Remark 18] for a counterexample in a continuous-time optimal sampled-data control problem (roughly speaking with $\mathbb{T} = \mathbb{R}_+$ and $\mathbb{T}_1 = \mathbb{N}$). I refer to Chapter 4 for a developed discussion on that precise point. Indeed it turns out that property (i) can be recovered in the case where $\mathbb{T} = \mathbb{R}_+$ and $\mathbb{T}_1 \subset \mathbb{T}$ is discrete and optimal in a sense that will be made precise in Chapter 4.

Now let us discuss the preservation (or not) of property (ii). In the present chapter, I consider the final time $b$ to be fixed in Problem (P). Nevertheless we have considered in the works [B12, B16] the case where the final time $b$ is free. In that context, if $(x^*,u^*,b^*)$ is an optimal triplet and $b^*$ belongs to the $\mathbb{R}$-interior of $\mathbb{T}$, then it can be proved that the Lagrange multiplier provided in Theorem 3.6 can be selected such that $\mathcal{H}$ coincides almost everywhere, in some $\mathbb{R}$-neighborhood of $b^*$, with a continuous function vanishing at $t = b^*$. I refer to the last item of [B16, Theorem 2.6] and to [B16, Remarks 8 and 9] for details. However I would like to emphasize that this result does not hold true if $b^*$ does not belong to the $\mathbb{R}$-interior of $\mathbb{T}$. Indeed a counterexample with a discrete-time optimal permanent control problem (roughly speaking with $\mathbb{T} = \mathbb{T}_1 = \mathbb{N}$) with free final time can be found in [B12, Example 8].

Remark 3.17. In this remark, for simplicity, we still focus on the state constraint-free case (taking $j = 1$ and $h \equiv -1$). Consider the framework of Theorem 3.6 and assume that the Hamiltonian $H$ is affine in $u$ in the sense that it can be written as

$$ H(x,u,q,\lambda,t) = \langle H_1(x,q,\lambda,t), u \rangle_{\mathbb{R}^m} + H_2(x,q,\lambda,t), $$

for all $(x,u,q,\lambda,t) \in \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R} \times [a,b]_{\mathbb{T}}$, where $H_1 : \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R} \times [a,b]_{\mathbb{T}} \to \mathbb{R}^m$ and $H_2 : \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R} \times [a,b]_{\mathbb{T}} \to \mathbb{R}$. In optimal permanent control problems (that is, when $\mathbb{T} = \mathbb{T}_1$), it can be deduced from the Hamiltonian conditions of Theorem 3.6 that $H_1(x^*(t),q(t),\lambda,t) \in \mathrm{N}_{\mathrm{U}}[u^*(t)]$ for $\Delta$-almost every $t \in [a,b)_{\mathbb{T}}$. We deduce that the optimal permanent control $u^*$ must take its values on the boundary of $\mathrm{U}$ for $\Delta$-almost every $t \in [a,b)_{\mathbb{T}}$ such that $H_1(x^*(t),q(t),\lambda,t) \neq 0_{\mathbb{R}^m}$. This property is known as saturation of the control constraint set. It turns out that this property does not extend to the case of sampled-data controls. I refer to [B16, Remarks 25 and 26] for a counterexample. There, the control constraint set is given by $\mathrm{U} = [0,1]$ and the optimal permanent control is bang-bang, taking first the value $1$ and then the value $0$, while the optimal sampled-data control takes some moderate values in the interior $(0,1)$.

3.3.5 An overview in several stages of the proof of Theorem 3.6

In this section we assume, for the sake of simplicity and without loss of generality, that there is no Lagrange cost in Problem (P) (that is, $L \equiv 0$). As is well known in optimal control theory, this situation can be easily recovered by considering an augmented version of the state equation (see, e.g., [118, Section 2.1.4]).

When Trélat and I started our collaboration with a view to obtaining a time scale version of the Pontryagin maximum principle, our preliminary paper [B13] was motivated by the need to complete the existing literature on Cauchy–Lipschitz theory (also known as Picard–Lindelöf theory) for general nonlinear differential equations posed on time scales. Indeed, at that time, most of the literature on that topic was concerned with dynamics satisfying some continuity assumptions, which was not an appropriate setting for dealing with control systems in which the control may be discontinuous. Furthermore some well known results from the classical theory had not yet been extended to the time scale setting. For example, the alternative theorem (about the behavior of maximal but nonglobal solutions to Cauchy problems) had not yet been addressed in the literature, while this result plays a crucial role in proving that the set of admissible controls for globality is open in a certain sense (see, e.g., [B16, Lemma 4.3]). Therefore Trélat and I extended in [B13] the classical Cauchy–Lipschitz theory to general Carathéodory dynamics posed on time scales. Under appropriate assumptions, we derived some existence-uniqueness results for maximal solutions and we proved that any maximal solution which is not global is unbounded on its interval of definition. The contributions of the article [B13] are not developed further in the present manuscript.

Once the preliminary paper [B13] was written, the proof of Theorem 3.6 was decomposed in several stages in the works [B12, B16] in collaboration with Trélat, and recently in the work [B22] written with Bettiol. Precisely:

• In the first paper [B12], Theorem 3.6 has been proved in the case of general state constraint-free optimal permanent control problems on time scales, that is, by taking $\mathbb{T}_1 = \mathbb{T}$ (permanent control) and $j = 1$ and $h \equiv -1$ (state constraint-free case);

• In the second article [B16], Theorem 3.6 has been proved in the case of general state constraint-free optimal sampled-data control problems on time scales, that is, by taking $\mathbb{T}_1 \subset \mathbb{T}$ (possibly different, so sampled-data control) and $j = 1$ and $h \equiv -1$ (state constraint-free case);

• In the third work [B22], Theorem 3.6 has been proved in the case of general state constrained optimal sampled-data control problems on time scales, that is, in its entirety.

The aim of the next paragraphs is to underline the major difficulties encountered at each of the above stages and the techniques that we used in order to overcome them. Before coming to these points, I point out that several different proofs of the classical Pontryagin maximum principle are known in the literature. Basically they consist of three steps:

(i) Firstly one has to perform the sensitivity analysis of the state equation with respect to perturbations of the control $u$. Hence one obtains variation vectors which are solutions to linearized versions of the control system and which generate the so-called Pontryagin cone, serving as a first-order approximation of the reachable set (but without taking into account any constraint of Problem (P)).

(ii) Secondly one has to take into account the constraints of Problem (P) (such as the terminal state constraint $\psi(x(a),x(b)) \in \mathrm{S}$ for example). To this aim, one can invoke various arguments: Brouwer fixed point theorem, implicit function theorem, Ekeland variational principle, etc. At this step one constructs a Lagrange multiplier which is normal to the Pontryagin cone (by taking into account, this time, the constraints of Problem (P)).

(iii) Finally the adjoint vector is constructed by propagating backward in time the above Lagrange multiplier, according to the adjoint system of the linearized control system satisfied by the variation vectors.

In the above step (i), note that various perturbations of the control have been considered in the literature. Roughly speaking, when considering a convex $L^\infty$-perturbation of the control $u$ (of the form $u_\delta := u + \delta(v-u)$ for a small perturbation parameter $\delta > 0$ and a general function $v$), one can (only) expect to obtain a weak version of the Pontryagin maximum principle (in which the standard Hamiltonian maximization condition is replaced by the weaker nonpositive Hamiltonian gradient condition). In order to obtain a strong version of the Pontryagin maximum principle, one should opt for $L^1$-perturbations of the control $u$. For example, the (explicit) needle-like perturbation of the control $u$, which consists in considering $u_\delta := v$ over a small interval of length $\delta$ and $u_\delta := u$ elsewhere, belongs to the class of $L^1$-perturbations of the control. I refer to the next paragraphs for more details on these techniques.
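The effect of a needle-like perturbation can be observed on a toy example in the continuous-time setting $\mathbb{T} = \mathbb{R}_+$. The Python sketch below assumes, purely for illustration, the scalar dynamics $f(x,u) = -x + u^2$ and a constant reference control $u$ (neither comes from the works cited above, and the function names are hypothetical). For this choice the state can be propagated in closed form, and the difference quotient $(x_\delta(t) - x(t))/\delta$ can be compared with the solution $w$ of the linearized equation $\dot{w} = \nabla_x f \, w = -w$ starting from $w(s) = f(x(s),v) - f(x(s),u) = v^2 - u^2$.

```python
import math


def flow(x, c, dt):
    """Exact solution after a duration dt of dx/dt = -x + c**2 with a constant control c."""
    decay = math.exp(-dt)
    return x * decay + c * c * (1.0 - decay)


def needle_quotient(t, x0, u, v, s, delta):
    """Difference quotient (x_delta(t) - x(t)) / delta for a needle of value v on [s, s + delta).

    Requires t >= s + delta; the reference control is the constant u.
    """
    x_ref = flow(x0, u, t)
    x_pert = flow(flow(flow(x0, u, s), v, delta), u, t - s - delta)
    return (x_pert - x_ref) / delta


def variation_vector(t, u, v, s):
    """Solution at time t >= s of dw/dt = -w with w(s) = f(x(s), v) - f(x(s), u) = v**2 - u**2."""
    return (v * v - u * u) * math.exp(-(t - s))
```

Evaluating `needle_quotient(3.0, 1.0, 0.5, 2.0, 1.0, delta)` for decreasing values of `delta` gives values approaching `variation_vector(3.0, 0.5, 2.0, 1.0)`, with an error of order `delta`. Note that the initial condition of $w$ involves a difference of values of $f$ and not the gradient $\nabla_u f$.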

First stage in the paper [B12]. In this paragraph we take $\mathbb{T}_1 = \mathbb{T}$ (permanent control) and $j = 1$ and $h \equiv -1$ (state constraint-free case). In the paper [B12], we encountered technical difficulties at each step (i), (ii), (iii) which were specific to the consideration of a general time scale setting.

(i) The first difficulty was to select suitable perturbations of the control $u$ in the time scale setting. In order to obtain a strong version of the Pontryagin maximum principle, our idea was to invoke (explicit) needle-like perturbations of the control. Fix a vector $v \in \mathbb{R}^m$. At a point $s \in [a,b)_{\mathbb{T}} \cap \mathrm{RD}$, for all $\delta > 0$ such that $s + \delta \in \mathbb{T}$, we consider the perturbed control $u_\delta$ defined by

$$ u_\delta(\tau) := \begin{cases} v & \text{if } \tau \in [s,s+\delta)_{\mathbb{T}}, \\ u(\tau) & \text{elsewhere}, \end{cases} $$

for $\Delta$-a.e. $\tau \in [a,b)_{\mathbb{T}}$. Since $s$ is a right-dense point, we can make $\delta$ tend to zero and the variation vector obtained (using a time scale version of the Grönwall lemma [115, Section 6.1]) is the unique global solution to the linearized Cauchy problem given by

$$ \begin{cases} w^\Delta(t) = \nabla_x f(x(t),u(t),t) \, w(t), & \Delta\text{-a.e. } t \in [s,b)_{\mathbb{T}}, \\ w(s) = f(x(s),v,s) - f(x(s),u(s),s). \end{cases} $$

In contrast, at a point $r \in [a,b)_{\mathbb{T}} \cap \mathrm{RS}$, the above approach cannot be adapted (since one cannot make $\delta > 0$ tend to zero while preserving $r + \delta \in \mathbb{T}$). In that situation one can (only) consider a convex $L^\infty$-perturbation of the control given by

$$ u_\delta(\tau) := \begin{cases} u(r) + \delta(v - u(r)) & \text{if } \tau = r, \\ u(\tau) & \text{elsewhere}, \end{cases} $$

for $\Delta$-a.e. $\tau \in [a,b)_{\mathbb{T}}$. Making $\delta$ tend to zero, the variation vector obtained is the unique global solution to the linearized Cauchy problem given by

$$ \begin{cases} w^\Delta(t) = \nabla_x f(x(t),u(t),t) \, w(t), & \Delta\text{-a.e. } t \in [\sigma(r),b)_{\mathbb{T}}, \\ w(\sigma(r)) = \mu(r) \nabla_u f(x(r),u(r),r)(v - u(r)). \end{cases} $$

This difference in the possible perturbations (or not) of the control $u$, according to right-dense or right-scattered points of the time scale $\mathbb{T}$, and thus the difference in the initial conditions of the corresponding variation vectors, is at the origin of the different Hamiltonian conditions formulated in Theorem 3.6.

(ii) In order to obtain the above variation vectors at right-dense points, we have used the following property: when considering a function $z \in L^1_\Delta([a,b)_{\mathbb{T}},\mathbb{R}^n)$ and $s \in [a,b)_{\mathbb{T}} \cap \mathrm{RD}$ which is a $\Delta$-Lebesgue point of $z$, we have

$$ \lim_{\substack{\delta \to 0^+ \\ s+\delta \in \mathbb{T}}} \frac{1}{\delta} \int_{[s,s+\delta)_{\mathbb{T}}} z(\tau) \, \Delta\tau = z(s). \tag{3.4} $$

Unfortunately, if we remove the constraint $s + \delta \in \mathbb{T}$, the above limit fails in general. As a consequence, since we consider a very general time scale $\mathbb{T}$ (which can be a Cantor set for example), there is no reason to consider that the perturbation parameter $\delta > 0$ lies in an interval. To the best of our knowledge, this topological obstruction excludes the use of several standard methods in order to accomplish the step (ii) of the proof of the Pontryagin maximum principle in a general time scale setting. For example, the Brouwer fixed point argument and the implicit function theorem both require that the set of perturbation parameters is convex. Actually, when Trélat and I started our collaboration, our preliminary approaches were based on these arguments until we realized that they are unfruitful in a general time scale setting.

In our paper [B12] we opted for an alternative approach based on the Ekeland variational principle [138, Theorem 1.1] which turns out to be suitable in order to avoid the above obstruction. The idea is to define a penalized functional by adding the square $d_{\mathrm{S}}[\psi(x(a),x(b))]^2$ of the distance from $\psi(x(a),x(b))$ to the set $\mathrm{S}$ in the cost function (see [B12, Section 3.1.1]). However the Ekeland variational principle requires the penalized functional to be defined on a complete metric space, which has led us to assume that $\mathrm{U}$ is closed (see [B12, Section 3.3.1] for details). Invoking the Ekeland variational principle together with the perturbations of the control $u$ mentioned in the above step (i), we were able to construct a Lagrange multiplier taking into account the terminal state constraint $\psi(x(a),x(b)) \in \mathrm{S}$.

To conclude this step (ii), let me mention that the recent work [114] by Bohner et al. aims at removing the closedness assumption on $\mathrm{U}$ by using an approach based on necessary conditions for an extremum in a cone. Unfortunately, as explained above, the authors need the perturbation parameters to lie in intervals. As a consequence, the authors of [114] made some density assumptions on the time scale $\mathbb{T}$ of the form

$$ \lim_{\substack{\delta \to 0^+ \\ s+\delta \in \mathbb{T}}} \frac{\mu(s+\delta)}{\delta} = 0, $$

at all points $s \in [a,b)_{\mathbb{T}} \cap \mathrm{RD}$ in order to guarantee that Equality (3.4) holds even when removing the constraint $s + \delta \in \mathbb{T}$. I refer to the addendum [B17] written in collaboration with Trélat and Stanzhytskyi (who is one of the authors of [114]) for more details. Finally, obtaining a time scale version of the Pontryagin maximum principle with no closedness assumption on $\mathrm{U}$ and no assumption on the time scale $\mathbb{T}$ remains an open challenge.

(iii) Finally, in accordance with the linearized control systems satisfied by the variation vectors obtained in the step (i), the appropriate adjoint vector $p$ must be defined as the unique global solution to the adjoint linear differential equation given by

$$ p^\Delta(t) = -\nabla_x f(x(t),u(t),t)^\top p^\sigma(t), \quad \Delta\text{-a.e. } t \in [a,b)_{\mathbb{T}}, $$

together with a final condition $p(b)$ related to the Lagrange multiplier constructed in the previous step (ii). Note that a shift $\sigma$ is involved in the above adjoint equation. This is due to the time scale version of the Leibniz formula recalled in Section 3.2. Note that shifted Cauchy problems on time scales have also been studied in our preliminary paper [B13].
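The role of the constraint $s + \delta \in \mathbb{T}$ in Equality (3.4) of step (ii) can be illustrated on a simple time scale. The Python sketch below takes, as an assumption made only for this illustration, $\mathbb{T} = \{0\} \cup \{2^{-n} \mid n \in \mathbb{N}\}$, the right-dense point $s = 0$ and the constant function $z \equiv 1$, and it interprets the integral over $[0,\delta)_{\mathbb{T}}$ as the $\Delta$-measure of $[0,\delta) \cap \mathbb{T}$ even when $\delta \notin \mathbb{T}$. Each point $t = 2^{-n}$ with $n \geq 1$ is right-scattered with $\mu(t) = t$, so this measure is a geometric series whose value is computed exactly with rational numbers. The function names are hypothetical.

```python
from fractions import Fraction


def delta_measure_from_zero(delta):
    """Delta-measure of [0, delta) ∩ T for T = {0} ∪ {2^-n : n >= 0}, with 0 < delta <= 1.

    The point 0 has Delta-measure zero and each point t = 2^-n (n >= 1) has
    Delta-measure mu(t) = t, so the measure is the sum of 2^-n over all n >= n0,
    where n0 is the smallest n >= 1 such that 2^-n < delta. This sum equals 2 * 2^-n0.
    """
    delta = Fraction(delta)
    if not (0 < delta <= 1):
        raise ValueError("delta must belong to (0, 1]")
    n0 = 1
    while Fraction(1, 2**n0) >= delta:
        n0 += 1
    return Fraction(2, 2**n0)


def averaged_value_of_one(delta):
    """Value of (1 / delta) times the Delta-integral of z = 1 over [0, delta) ∩ T."""
    return delta_measure_from_zero(delta) / Fraction(delta)
```

Along $\delta = 2^{-m} \in \mathbb{T}$ the averaged value is equal to $1 = z(0)$, in agreement with Equality (3.4), whereas along $\delta = \tfrac{3}{2} \, 2^{-m} \notin \mathbb{T}$ it is equal to $4/3$ for every $m \geq 1$, so the unconstrained limit does not exist. Note that $\mu(\delta)/\delta = 1$ for all $\delta = 2^{-m}$ with $m \geq 1$, so that this time scale does not satisfy the density condition considered in [114].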

Second stage in the paper [B16]. In this paragraph we take $\mathbb{T}_1 \subset \mathbb{T}$ (possibly different, so sampled-data control) and $j = 1$ and $h \equiv -1$ (state constraint-free case). In my second paper [B16] in collaboration with Trélat, we (only) encountered a technical difficulty at step (i). The steps (ii) and (iii) are similar to those of the previous paper [B12].

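(i) Fix a vector $v \in \mathbb{R}^m$. At points $s \in [a,b)_{\mathbb{T}_1} \cap \mathrm{RD}_1$, our approach is not modified with respect to the previous paper [B12]. In contrast, at a point $r \in [a,b)_{\mathbb{T}_1} \cap \mathrm{RS}_1$, by considering the convex $L^\infty$-perturbation of the control $u$ given by

$$ u_\delta(\tau) := \begin{cases} u(r) + \delta(v - u(r)) & \text{if } \tau = r, \\ u(\tau) & \text{elsewhere}, \end{cases} $$

for $\Delta_1$-a.e. $\tau \in [a,b)_{\mathbb{T}_1}$, we obtained that the associated variation vector is the unique global solution to the linearized Cauchy problem given by

$$ \begin{cases} w^\Delta(t) = \begin{cases} \nabla_x f(x(t),u(r),t) \, w(t) + \nabla_u f(x(t),u(r),t)(v - u(r)), & \Delta\text{-a.e. } t \in [r,\sigma_1(r))_{\mathbb{T}}, \\ \nabla_x f(x(t),u(t),t) \, w(t), & \Delta\text{-a.e. } t \in [\sigma_1(r),b)_{\mathbb{T}}, \end{cases} \\ w(r) = 0_{\mathbb{R}^n}. \end{cases} $$

The fact that a nonhomogeneous term emerges on the sampling interval $[r,\sigma_1(r))_{\mathbb{T}}$ is at the origin of the averaging of the gradient in the Hamiltonian conditions of Theorem 3.6 at points $r \in [a,b)_{\mathbb{T}_1} \cap \mathrm{RS}_1$.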
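The nonhomogeneous linearized Cauchy problem above can be checked numerically on a toy example with $\mathbb{T} = \mathbb{R}_+$ and $\mathbb{T}_1 = \mathbb{N}$. The Python sketch below assumes, purely for illustration, the scalar dynamics $f(x,u) = -x + u^2$ (this choice and all function names are hypothetical). In that case $\nabla_x f = -1$ and $\nabla_u f = 2u$, so the variation vector solves $\dot{w} = -w + 2u_r(v - u_r)$ on $[r,r+1)$ with $w(r) = 0$, and $\dot{w} = -w$ afterwards. The sketch compares it with the difference quotient obtained from the convex perturbation of the sampled value $u_r$.

```python
import math


def flow(x, c, dt):
    """Exact solution after a duration dt of dx/dt = -x + c**2 with a constant control c."""
    decay = math.exp(-dt)
    return x * decay + c * c * (1.0 - decay)


def state(t_final, x0, controls):
    """State at the integer time t_final for a control frozen at controls[k] on [k, k+1)."""
    x = x0
    for k in range(t_final):
        x = flow(x, controls[k], 1.0)
    return x


def difference_quotient(t_final, x0, controls, r, v, delta):
    """(x_delta(t_final) - x(t_final)) / delta when controls[r] is moved towards v."""
    perturbed = list(controls)
    perturbed[r] = controls[r] + delta * (v - controls[r])
    return (state(t_final, x0, perturbed) - state(t_final, x0, controls)) / delta


def variation_vector(t_final, controls, r, v):
    """Solution at the integer time t_final >= r + 1 of the linearized Cauchy problem."""
    forcing = 2.0 * controls[r] * (v - controls[r])  # grad_u f(x, u_r) * (v - u_r)
    w_after_interval = forcing * (1.0 - math.exp(-1.0))  # w(r + 1), starting from w(r) = 0
    return w_after_interval * math.exp(-(t_final - r - 1))
```

For example, with `controls = [0.5, 1.0, -0.3, 0.8]`, `r = 1` and `v = 2.0`, the value `difference_quotient(4, 0.7, controls, 1, 2.0, 1e-6)` is close to `variation_vector(4, controls, 1, 2.0)`. In contrast with the needle-like case, the variation vector starts from zero and is driven by the gradient $\nabla_u f$ over the whole sampling interval.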

Third stage in the work [B22]. In this paragraph, let us consider the statement of Theorem 3.6 in its entirety (in particular in the state constrained case). In general (not only in the time scale and sampled-data control setting), the extension of the Pontryagin maximum principle from state constraint-free problems to state constrained problems is not trivial. It requires several adjustments at each step (i), (ii) and (iii).

(i) The first difficulty comes from the fact that the perturbations of the control considered in the previous papers [B12, B16] are (only) local, and thus the corresponding differentiability of the state is obtained (only) over the interval $[s+\varsigma,b]_{\mathbb{T}}$ for any small $\varsigma > 0$ (when considering a perturbation at a point $s \in [a,b)_{\mathbb{T}_1} \cap \mathrm{RD}_1$) and over the interval $[r,b]_{\mathbb{T}}$ (when considering a perturbation at a point $r \in [a,b)_{\mathbb{T}_1} \cap \mathrm{RS}_1$). I refer to [B16, Propositions 2, 3 and 4] for details. However the analysis of the inequality state constraints $h_i(x(t),t) \leq 0$ in Problem (P) requires differentiability of the state over the whole interval $[a,b]_{\mathbb{T}}$. In order to overcome this technical difficulty, our idea in the work [B22] was to perform the sensitivity analysis of the state equation under implicit spike variations of the control. This concept was used in [120, 172] for state constrained continuous-time optimal permanent control problems.

First, let us expound the general idea of this concept in the basic framework $\mathbb{T} = \mathbb{T}_1 = \mathbb{R}_+$ with $(a,b) = (0,T)$ for some $T > 0$. To this aim, fix $v \in L^\infty([0,T],\mathbb{R}^m)$. The idea is to consider an $L^1$-perturbation of the control $u$ given by

$$ u_\delta(\tau) := \begin{cases} v(\tau) & \text{if } \tau \in Q_\delta, \\ u(\tau) & \text{elsewhere}, \end{cases} $$

for a.e. $\tau \in [0,T]$, where $Q_\delta \subset [0,T]$ is such that $\mu_L(Q_\delta) = \delta T$ and such that the corresponding variation vector is the unique global solution to the linearized Cauchy problem given by

$$ \begin{cases} \dot{w}(t) = \nabla_x f(x(t),u(t),t) \, w(t) + f(x(t),v(t),t) - f(x(t),u(t),t), & \text{a.e. } t \in [0,T], \\ w(0) = 0_{\mathbb{R}^n}. \end{cases} $$

The existence of such a subset $Q_\delta \subset [0,T]$ is highly nontrivial. The proof can be found in [172, p.143] and is strongly based on the fact that the Lebesgue measure $\mu_L$ is nonatomic and on the Sierpinski theorem (see, e.g., [191] or [144, p.37]).

Now let us discuss the adaptation of the above method to the context of the work [B22]. Firstly, note that the above variation vector does not involve the gradient $\nabla_u f$. Since we know that the Hamiltonian maximization condition does not hold true in our general time scale and sampled-data control setting (and needs to be replaced by a nonpositive averaged Hamiltonian gradient condition), we do expect that the above method cannot be exactly adapted. First of all, let us emphasize that the Lebesgue $\Delta_1$-measure $\mu_{\Delta_1}$ is atomic (since the $\Delta_1$-measure of any right-scattered point of $\mathbb{T}_1$ is positive). To overcome this difficulty, we introduced in [B22, Section 5.1.2] a suitable time scale version of the concept of implicit spike variations, by distinguishing the perturbation of the control at right-dense points and at right-scattered points of $[a,b)_{\mathbb{T}_1}$. Precisely, fix $v \in L^\infty_{\Delta_1}([a,b)_{\mathbb{T}_1},\mathbb{R}^m)$. The idea is to consider a perturbation of the control $u$ given by

$$ u_\delta(\tau) := \begin{cases} v(\tau) & \text{if } \tau \in Q_\delta, \\ u(\tau) & \text{if } \tau \in \mathrm{RD}_1 \setminus Q_\delta, \\ u(\tau) + \delta(v(\tau) - u(\tau)) & \text{if } \tau \in \mathrm{RS}_1, \end{cases} $$

for $\Delta_1$-a.e. $\tau \in [a,b)_{\mathbb{T}_1}$, where $Q_\delta \subset [a,b)_{\mathbb{T}_1} \cap \mathrm{RD}_1$ is such that $\mu_{\Delta_1}(Q_\delta) = \delta \mu_{\Delta_1}([a,b)_{\mathbb{T}_1} \cap \mathrm{RD}_1)$ and such that the variation vector obtained is the unique global solution to the linearized Cauchy problem given by

$$ \begin{cases} w^\Delta(t) = \nabla_x f(x(t),u(t),t) \, w(t) + \xi(t), & \Delta\text{-a.e. } t \in [a,b)_{\mathbb{T}}, \\ w(a) = 0_{\mathbb{R}^n}, \end{cases} $$

where

$$ \xi(t) := \begin{cases} f(x(t),v(t),t) - f(x(t),u(t),t) & \text{if } t \in \mathrm{RD}_1, \\ \nabla_u f(x(t),u(r),t)(v(r) - u(r)) & \text{if } t \in [r,\sigma_1(r))_{\mathbb{T}} \text{ for some } r \in [a,b)_{\mathbb{T}_1} \cap \mathrm{RS}_1. \end{cases} $$

Here the construction of $Q_\delta$ is possible because we took $Q_\delta \subset [a,b)_{\mathbb{T}_1} \cap \mathrm{RD}_1$ and since the Lebesgue $\Delta_1$-measure $\mu_{\Delta_1}$ is nonatomic over $\mathrm{RD}_1$. I refer to [B22, Section 5.1.2] for all details.

(ii) The state constraints $h_i(x(t),t)\leq 0$ in Problem (P) can easily be rewritten as $h(x)\in\mathcal{S}$, where $h:\mathrm{C}([a,b]_{\mathbb{T}},\mathbb{R}^n)\to\mathrm{C}([a,b]_{\mathbb{T}},\mathbb{R}^j)$ and $\mathcal{S}:=\mathrm{C}([a,b]_{\mathbb{T}},\mathbb{R}^j_-)$ (see [B22, Section 5.2] for details). In the same spirit as step (ii) in the previous papers [B12, B16], our idea was to add the square $d_{\mathcal{S}}[h(x)]^2$ of the distance from the term $h(x)$ to the set $\mathcal{S}$ in the Ekeland penalized functional. Unfortunately, in contrast to $d^2_{\mathrm{S}}$ in the finite-dimensional case, it turns out that $d^2_{\mathcal{S}}$ does not enjoy nice differentiability properties when endowing $\mathrm{C}([a,b]_{\mathbb{T}},\mathbb{R}^j)$ with its usual uniform norm $\|\cdot\|_\infty$. To overcome this difficulty, since $(\mathrm{C}([a,b]_{\mathbb{T}},\mathbb{R}^j),\|\cdot\|_\infty)$ is a separable Banach space, we can endow $\mathrm{C}([a,b]_{\mathbb{T}},\mathbb{R}^j)$ with an equivalent norm such that the corresponding dual norm is strictly convex (see [172, Theorem 2.18]). In that context we obtained that $d^2_{\mathcal{S}}$ is Fréchet-differentiable on $\mathcal{S}$ and is strictly Hadamard-differentiable outside of $\mathcal{S}$ (see [179, Theorem 3.54]). Then, invoking the Ekeland variational principle together with the perturbations of the control mentioned in the above step (i), we were able to construct a Lagrange multiplier taking into account the terminal state constraint $\psi(x(a),x(b))\in\mathrm{S}$ and the constraint $h(x)\in\mathcal{S}$ of Problem (P). In particular I emphasize that the Lagrange multiplier naturally involves a term belonging to the dual space of $\mathrm{C}([a,b]_{\mathbb{T}},\mathbb{R}^j)$, and more precisely to the normal cone $\mathrm{N}_{\mathcal{S}}[h(x^*)]$. From this normality and from the Riesz representation theorem (see [186, Theorem 2.14]), we were able to characterize this dual term as finite nonnegative Borel measures $\mathrm{d}\eta_i$ satisfying the complementary slackness condition given in Theorem 3.6.

(iii) Finally, in accordance with the linearized control systems satisfied by the variation vectors obtained in step (i), but also with the emergence of the finite nonnegative Borel measures $\mathrm{d}\eta_i$ in the Lagrange multiplier, the appropriate AC-adjoint vector $p$ must be defined as the unique global solution to the adjoint linear differential equation given by

\[
p^\Delta(t) = -\nabla_x f(x(t),u(t),t)^\top \Big( p^\sigma(t) + \sum_{i=1}^{j} \int_{[a,t]_{\mathbb{T}}} \nabla_x h_i(x^*(\tau),\tau)\,\mathrm{d}\eta_i(\tau) \Big), \quad \Delta\text{-a.e. } t\in[a,b)_{\mathbb{T}},
\]

together with a final condition $p(b)$ related to the Lagrange multiplier. The proof is concluded by defining the BV-adjoint vector $q$ as in Theorem 3.6.

I refer to [B22, Section 6] for the complete and detailed proof of Theorem 3.6.



3.3.6 Perspectives

Several extensions should be envisaged regarding the sampling procedure. Indeed, in the present chapter (and in the works [B16, B22]), we have considered a sample-and-hold procedure (see Definition 3.1) in order to recover the usual case of sampled-data controls in continuous-time problems (roughly speaking with $\mathbb{T}=\mathbb{R}_+$ and $\mathbb{T}_1=\mathbb{N}$). The priority in my future works will be to explore the wide range of possible sampling procedures, among which: a state-dependent sampling procedure on the control (in view of dealing with state areas of noncontrol for example), a sampling procedure on the state (in view of dealing with hybrid or delay systems for example) and/or on the state constraint (in view of dealing with nonpermanent state constraints for example), etc. The time scale theory, combined with sampling procedures, offers a large panorama of possible perspectives in optimal control theory. Actually this is also true for control theory more generally. Indeed one might consider studying questions related to the controllability of sampled-data control systems on time scales (such as the extension of the well known Kalman condition for example).

I conclude this section with two natural issues which will be addressed in the next chapters. Consider Problem (P) with $\mathbb{T}=\mathbb{R}_+$, take $(a,b)=(0,T)$ for some $T>0$ and consider the state constraint-free case (that is, take $j=1$ and $h\equiv -1$). Then recall that an $N$-partition of the interval $[0,T]$ is a finite set $P:=\{t_k\}_{k=0,\ldots,N}$ such that

\[
0 = t_0 < t_1 < \ldots < t_{N-1} < t_N = T,
\]

where $N\in\mathbb{N}^*$. We denote by $\|P\| := \max_{k=0,\ldots,N-1}|t_{k+1}-t_k|$ the diameter of the partition $P$.

• Fix an $N$-partition $P:=\{t_k\}_{k=0,\ldots,N}$ of the interval $[0,T]$. Consider $\mathbb{T}_1 = P\cup[T,+\infty)$. In that situation, the times $t_k$ are called the sampling times and they correspond to the times at which (and only at which) the value of the control can be modified. In the present chapter (and in the works [B16, B22]), we have dealt with the case where the sampling times $t_k$ are fixed. In that context, applying Theorem 3.6, we have seen in Remark 3.16 that the maximized Hamiltonian function $H$ is discontinuous in general. In the next Chapter 4 (extracted from the paper [B19] written in collaboration with Dhar), we will consider the case where the sampling times $t_k$ are free and thus are parameters to optimize. In that context we will see that the corresponding necessary optimality condition coincides with the continuity of the maximized Hamiltonian function $H$.

• Assume that Problem (P) with $\mathbb{T}_1=\mathbb{R}_+$ (permanent control) admits a solution $(x^*,u^*)$ and let us denote by $(p,\lambda)$ the couple provided in Theorem 3.6. Now assume that, for all partitions $P$ of the interval $[0,T]$, Problem (P) with $\mathbb{T}_1 = P\cup[T,+\infty)$ (sampled-data control) admits an optimal solution $(x^*_P,u^*_P)$ and let us denote by $(p_P,\lambda_P)$ the couple provided in Theorem 3.6. The following natural question emerges: what can be said about the convergence of the elements $x^*_P$, $u^*_P$, $p_P$ and $\lambda_P$ when the diameter $\|P\|$ of the partition $P$ tends to zero? This question will be addressed in Chapter 5 (for unconstrained linear-quadratic optimal sampled-data control problems) and in Chapter 6 (in a more general nonlinear setting and with terminal state constraints and control constraints).

3.4 The observation of a bouncing trajectory phenomenon

This section is essentially extracted from the paper [B21] written in collaboration with Dhar. Let us fix some $T>0$ and $P=\{t_k\}_{k=0,\ldots,N}$ an $N$-partition of the interval $[0,T]$. In this section we will focus on the state constrained continuous-time optimal sampled-data control problem given by

\[
\begin{array}{rl}
\text{minimize} & g(x(T)) + \displaystyle\int_0^T L(x(\tau),u(\tau),\tau)\,\mathrm{d}\tau, \\[6pt]
\text{subject to} & x\in\mathrm{AC}([0,T],\mathbb{R}^n), \quad u\in\mathrm{PC}^P([0,T],\mathbb{R}^m), \\[3pt]
& \dot{x}(t) = f(x(t),u_k,t), \quad \text{a.e. } t\in[t_k,t_{k+1}), \quad \forall k=0,\ldots,N-1, \\[3pt]
& x(0) = x_0, \\[3pt]
& h(x(t),t)\leq 0, \quad \forall t\in[0,T], \\[3pt]
& u_k\in\mathrm{U}, \quad \forall k=0,\ldots,N-1,
\end{array}
\tag{P1}
\]

where $x_0\in\mathbb{R}^n$ is fixed and where $\mathrm{PC}^P([0,T],\mathbb{R}^m)$ stands for the set of piecewise constant functions defined on $[0,T]$ with values in $\mathbb{R}^m$ respecting the partition $P$, that is

\[
\mathrm{PC}^P([0,T],\mathbb{R}^m) := \{\, u:[0,T]\to\mathbb{R}^m \mid \forall k=0,\ldots,N-1, \ \exists u_k\in\mathbb{R}^m, \ \forall t\in[t_k,t_{k+1}), \ u(t)=u_k \,\}.
\]

In other words, Problem (P1) corresponds to a particular case of Problem (P) in which $\mathbb{T}=\mathbb{R}_+$ and $\mathbb{T}_1 = P\cup[T,+\infty)$ with $(a,b)=(0,T)$. Furthermore $\ell=2n$, $\psi$ is the identity function and $\mathrm{S}=\{x_0\}\times\mathbb{R}^n$. In that context note that the Mayer cost function $g$ depends only on the final state $x(T)$. Finally, for simplicity, only one inequality state constraint $h(x(t),t)\leq 0$ is considered (that is, $j=1$).
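As a concrete reading of the notions of partition, diameter and piecewise constant control used above, here is a minimal Python sketch. It is not taken from [B21]: the function names `diameter` and `piecewise_constant` and the sample data are illustrative, and the value assigned at the final time t = T is a convention chosen for this sketch, since the definition of the set of piecewise constant controls only constrains u on the intervals [t_k, t_{k+1}).

```python
import bisect


def diameter(partition):
    """Largest gap between consecutive sampling times of the partition."""
    return max(t1 - t0 for t0, t1 in zip(partition, partition[1:]))


def piecewise_constant(partition, values):
    """Build u with u(t) = values[k] for every t in [t_k, t_{k+1})."""
    if len(values) != len(partition) - 1:
        raise ValueError("one control value per sampling interval is required")

    def u(t):
        if not partition[0] <= t <= partition[-1]:
            raise ValueError("t lies outside [t_0, t_N]")
        # Index of the sampling interval containing t; the final time t_N is
        # attached to the last interval (a convention assumed in this sketch).
        k = min(bisect.bisect_right(partition, t) - 1, len(values) - 1)
        return values[k]

    return u


# Hypothetical data: a nonuniform 3-partition of [0, 3].
partition = [0.0, 0.5, 2.0, 3.0]
u = piecewise_constant(partition, [1.0, -1.0, 0.25])
```

The control value can only change at a sampling time: `u(0.49)` and `u(0.5)` differ here, whereas `u` is constant on each interval between two consecutive sampling times.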

When Dhar and I undertook the study of Problem (P1) in 2018, one of our first actions was to solve numerically some simple examples by using a direct numerical method (see Section 3.4.3 for some details). On this occasion we observed that the optimal trajectories returned by the algorithm had a common behavior with respect to the inequality state constraint. Precisely the optimal trajectories were "bouncing" on it. I refer to Figure 3.3 for an illustration of this behavior which we refer to as the bouncing trajectory phenomenon. Actually, this phenomenon concerns, not only the optimal trajectories, but all admissible trajectories.

In this section my aim is to give a detailed description of this new observation (which does not hold in general in classical state constrained continuous-time optimal permanent control problems). Precisely we will show that, under certain (quite unrestrictive) hypotheses, an admissible trajectory of Problem (P1) necessarily bounces on the inequality state constraint and, moreover, the activating times occur at most at the sampling times $t_k$ (and thus in a finite number and at precise instants). As detailed later in Section 3.4.3, this feature presents some benefits from a numerical point of view in indirect methods.

3.4.1 Heuristical discussion: the expected behavior of an admissible trajectory

We start this section by recalling some standard terminology from [155, p.183] or [190, p.105]. Let $x$ be an admissible trajectory of Problem (P1). An element $t\in[0,T]$ is called an activating time if it satisfies $h(x(t),t)=0$. An interval $[\tau_1,\tau_2]\subset[0,T]$, with $\tau_1<\tau_2$, is called a boundary interval if $h(x(t),t)=0$ for all $t\in[\tau_1,\tau_2]$. Note that any point of a boundary interval is an activating time, while the converse is not true in general. In what follows, we say that the trajectory $x$ exhibits the bouncing trajectory phenomenon if the set of activating times contains no boundary interval.

My aim in this section is to give some heuristic descriptions (and illustrative figures) of the main reason why a bouncing trajectory phenomenon is usually displayed when dealing with sampled-data controls in the presence of an inequality state constraint (see (i) below) and why, moreover, the activating times occur at most at the sampling times $t_k$ (see (ii) below). The rigorous mathematical justifications will be provided in the next Section 3.4.2.

(i) In state constrained continuous-time optimal permanent control problems, a boundary interval may correspond to a feedback control, that is, to an expression of the control as a function of the state. Such an expression usually leads to a nonconstant control. More generally, an inequality state constraint usually cannot be activated by a trajectory on an interval $[\tau_1,\tau_2]$ with $\tau_1<\tau_2$ on which the associated (permanent) control is constant. We refer to Figure 3.1 for an illustration. Therefore, since we deal with piecewise constant controls in Problem (P1), one should expect that an admissible trajectory of Problem (P1) does not possess any boundary interval and thus displays a bouncing trajectory phenomenon. In order to guarantee the validity of this remark, it is sufficient to make an assumption on $f$ and $h$ which prevents the existence of an admissible trajectory $x$ of Problem (P1) and of an interval $[\tau_1,\tau_2]\subset[0,T]$ with $\tau_1<\tau_2$ for which $\varphi^{(k)}(t)=0$ for all $k\in\mathbb{N}$ and all $t\in[\tau_1,\tau_2]$, where $\varphi$ is defined by $\varphi(t):=h(x(t),t)$ for all $t\in[0,T]$. This will be done in Section 3.4.2 (see Assumption (A1)).

[Figure 3.1: sketch of a trajectory x along the inequality state constraint. Annotation: "Usually the control is not constant along a boundary interval".]

Figure 3.1 – In state constrained continuous-time optimal permanent control problems, a boundary interval is usually associated to a nonconstant control.

(ii) Let $t\in[0,T]$ be a left isolated (resp. right isolated) activating time of an admissible trajectory $x$ of Problem (P1). We denote by $u$ the corresponding piecewise constant control. Let us assume that $t$ is not a sampling time, that is, $t\in(t_k,t_{k+1})$ for some $k\in\{0,\ldots,N-1\}$. Usually the trajectory $x$ "hits" the inequality state constraint transversely at $t$. Since the control value $u(t)=u_k$ is fixed all along the sampling interval $[t_k,t_{k+1}]$, the trajectory $x$ then "crosses" the inequality state constraint immediately after $t$, which contradicts the admissibility of $x$. We refer to Figure 3.2 for an illustration. Hence, in order to preserve the admissibility of $x$, we understand that the control value must change at $t$, that is, since $u$ is piecewise constant, that $t$ must be one of the sampling times $t_k$. From this simple heuristic discussion, one should expect that an admissible trajectory of Problem (P1) has no left or right isolated activating time outside of the sampling times $t_k$. In order to guarantee the validity of this remark, it is sufficient to make an assumption on $f$ and $h$ which prevents the existence of an admissible trajectory of Problem (P1) which "hits" the inequality state constraint tangentially. This will be done in Section 3.4.2 (see Assumption (A2)). Actually Assumption (A2) will even guarantee that an admissible trajectory of Problem (P1) has no activating time outside of the sampling times $t_k$.

We conclude from (i) and (ii) that one should expect the admissible trajectories of Problem (P1) to exhibit the bouncing trajectory phenomenon and, moreover, their activating times to occur at most at the sampling times $t_k$ (and thus in a finite number and at precise instants). We refer to Figure 3.3 for an illustration of this feature. Note that, even if all activating times are sampling times, the converse is not true in general: a sampling time need not be an activating time.

I conclude this section by mentioning that the above descriptions are only heuristic and, of course, one can easily find counterexamples in which the behavior of Figure 3.3 is not observed. Nonetheless I would like to emphasize that the bouncing trajectory phenomenon is quite ordinary when dealing with sampled-data controls in continuous-time systems in the presence of inequality state constraints, as guaranteed by the mathematical justifications provided in the next Section 3.4.2 and as illustrated by an example numerically solved in Section 3.4.3 (see two other examples in [B21, Section 5]).

[Figure 3.2: sketch of a trajectory x reaching the inequality state constraint at a time t between t_k and t_{k+1}. Annotations: "Usually the trajectory "hits" the inequality state constraint transversely"; "Keeping the same control value u(t) = u_k, the trajectory "crosses" the inequality state constraint".]

Figure 3.2 – Illustration of an admissible trajectory $x$ hitting transversely the inequality state constraint at some left isolated activating time $t$ which belongs to the interior $(t_k,t_{k+1})$ of a sampling interval.

[Figure 3.3: sketch of a trajectory x over the sampling times t_{k-3}, t_{k-2}, t_{k-1}, t_k, t_{k+1}, t_{k+2}, t_{k+3}.]

Figure 3.3 – Illustration of the expected behavior of an admissible trajectory $x$ of Problem (P1).

3.4.2 Mathematical justifications: a sufficient condition for bouncing trajectories

In [B21, Proposition 4.1] (which is recalled in Proposition 3.18 below), we have formulated a sufficient condition ensuring the bouncing trajectory phenomenon and that the rebounds occur at most at the sampling times $t_k$. Throughout this section we assume that the dynamics $f$ and the inequality state constraint function $h$ are of class $\mathrm{C}^\infty$ in all variables.

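Similarly to [155, p.183], we introduce the functions $h^{[k]}:\mathbb{R}^n\times\mathbb{R}^m\times[0,T]\to\mathbb{R}$ defined by the induction

\[
\begin{array}{l}
h^{[0]}(x,u,t) := h(x,t), \\[3pt]
\forall k\in\mathbb{N}, \quad h^{[k+1]}(x,u,t) := \langle \nabla_x h^{[k]}(x,u,t), f(x,u,t) \rangle_{\mathbb{R}^n} + \nabla_t h^{[k]}(x,u,t),
\end{array}
\]

for all $(x,u,t)\in\mathbb{R}^n\times\mathbb{R}^m\times[0,T]$. We introduce the subset

\[
\mathcal{A} := \{\, (x,t)\in\mathbb{R}^n\times([0,T]\backslash P) \mid h(x,t)=0 \,\},
\]

and we denote by

\[
k'(x,u,t) := \min\{\, k\in\mathbb{N} \mid h^{[k]}(x,u,t)\neq 0 \,\} \in \mathbb{N}^*\cup\{+\infty\},
\]

for all $(x,t)\in\mathcal{A}$ and all $u\in\mathrm{U}$. Finally we introduce the set

\[
\mathcal{B}(x,t) := \{\, u\in\mathrm{U} \mid k'(x,u,t) \text{ is finite and even} \,\},
\]

for all $(x,t)\in\mathcal{A}$.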



Proposition 3.18. If the assumptions

\[
\forall (x,t)\in\mathcal{A}, \quad \forall u\in\mathrm{U}, \quad k'(x,u,t) < +\infty, \tag{A1}
\]

and

\[
\forall (x,t)\in\mathcal{A}, \quad \forall u\in\mathcal{B}(x,t), \quad h^{[k'(x,u,t)]}(x,u,t) > 0, \tag{A2}
\]

are both satisfied, then the activating times of all admissible trajectories of Problem (P1) are sampling times.

Remark 3.19. Proposition 3.18 can be found in [B21, Proposition 4.1] and its proof is based on basic Taylor expansion formulas. I emphasize that Assumptions (A1) and (A2) are the exact hypotheses which guarantee the validity of the arguments presented heuristically in the items (i) and (ii) of Section 3.4.1.
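To see how Assumptions (A1) and (A2) can be checked in practice, here is a short worked computation on the scalar data $f(x,u,t)=ux$, $h(x,t)=x-10t-2$, $\mathrm{U}=[0,1]$ and $T=12$, which are the data of the example treated in Section 3.4.3. It is only a sketch of the verification and is not copied from [B21, Section 5.3], whose computations may be organized differently.

```latex
\begin{align*}
h^{[0]}(x,u,t) &= x - 10t - 2, \\
h^{[1]}(x,u,t) &= \nabla_x h^{[0]}(x,u,t)\,(ux) + \nabla_t h^{[0]}(x,u,t) = ux - 10, \\
h^{[2]}(x,u,t) &= \nabla_x h^{[1]}(x,u,t)\,(ux) + \nabla_t h^{[1]}(x,u,t) = u^2 x.
\end{align*}
Let $(x,t)\in\mathcal{A}$, so that $x = 10t+2 > 0$, and let $u\in[0,1]$.
\begin{itemize}
\item If $ux \neq 10$, then $h^{[1]}(x,u,t)\neq 0$ and $k'(x,u,t) = 1$ (finite and odd).
\item If $ux = 10$, then $u = 10/x > 0$, hence $h^{[2]}(x,u,t) = u^2 x = 10u > 0$
      and $k'(x,u,t) = 2$ (finite and even).
\end{itemize}
In both cases $k'(x,u,t) < +\infty$, which gives (A1). The only even case satisfies
$h^{[k'(x,u,t)]}(x,u,t) > 0$, which gives (A2).
```

The second case is the tangential contact discussed in item (ii) of Section 3.4.1: if it occurred at a time $t$ inside a sampling interval, a second-order Taylor expansion would give $h(x(s),s)>0$ for $s$ close to $t$, which is not compatible with admissibility.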

3.4.3 Numerical experiments based on an indirect method

Two families of numerical methods prevail in classical state constraint-free continuous-time optimal permanent control theory. Direct methods consist in a full discretization of the optimal control problem, which results in a constrained finite-dimensional optimization problem that can be solved numerically with standard optimization algorithms. Indirect methods are so called because they are based on the Pontryagin maximum principle. Precisely, if the Hamiltonian maximization condition allows one to express the optimal permanent control $u^*$ as a function of the state $x^*$ and of the (absolutely continuous) adjoint vector $p$, then the indirect methods consist in the numerical resolution by a shooting method of the boundary value problem satisfied by the augmented vector $(x^*,p)$. Recall that neither approach is fundamentally better than the other. I refer to [195, p.170-171] for details and discussions on the advantages and drawbacks of each method.

In the state constrained case, direct methods can easily be adapted. In contrast, implementing indirect methods in that context is more intricate since the adjoint vector $q$ is not absolutely continuous in general, but (only) of bounded variation. From the Lebesgue decomposition (see, e.g., [125, Corollary 20.20]), we can write

\[
q = q_{ac} + q_{sc} + q_s,
\]

where $q_{ac}$ is the absolutely continuous part, $q_{sc}$ is the singularly continuous part and $q_s$ is the saltus or pure jump part of $q$. From the complementary slackness condition, it is known that the adjoint vector $q$ is absolutely continuous outside the activating times of $x^*$. On the other hand, on boundary intervals, the adjoint vector $q$ may have an infinite number of unlocalized jumps or a pathological behavior due to its singular part. As a consequence, an important part of the literature is devoted to the analysis of the behavior of the adjoint vector $q$ and some constraint qualification conditions have been established. I refer for instance to [110, 121, 155, 163, 175].

In [B21, Section 5], our aim was to propose an indirect method for solving numerically state constrained continuous-time optimal sampled-data control problems based on Theorem 3.6. In that setting the adjoint vector $q$ is also a function (only) of bounded variation and one should encounter a priori the same difficulties as outlined above. Nevertheless we have proved in Proposition 3.18 that, under the (quite unrestrictive) Assumptions (A1) and (A2), the optimal trajectory $x^*$ of Problem (P1) activates the inequality state constraint at most at the sampling times $t_k$. It follows that the adjoint vector $q$ has no singular part and admits a finite number of jumps which are localized at most at the sampling times $t_k$. Taking advantage of this knowledge, we were able in [B21, Section 5] to propose a "simple" indirect numerical method recalled in the next paragraph. Then, in the following paragraph, this indirect method is implemented in order to solve numerically a simple example (extracted from [B21, Section 5]). I point out that two additional examples have been numerically solved in [B21, Section 5].



An indirect numerical method. Let $(x^*,u^*)$ be a solution to Problem (P1). We denote by $\lambda$, $\mathrm{d}\eta$ and $q$ the elements provided in Theorem 3.6. In what follows we assume that the case is normal and we normalize $\lambda=1$ (see Remark 3.8), and we assume that Assumptions (A1) and (A2) are satisfied. As a consequence, it follows from Proposition 3.18 that $x^*$ activates the inequality state constraint at most at the sampling times $t_k$. From the complementary slackness condition in Theorem 3.6, we deduce that $\mathrm{d}\eta$ is the sum of $(N+1)$ nonnegative Dirac measures with supports localized exactly at the sampling times $t_k$. We denote the corresponding values by $\mathrm{d}\eta[k]$ for all $k=0,\ldots,N$. From the definition of the adjoint vector $q$ in Theorem 3.6, it follows that $q$ has no singular part, that it admits $(N+1)$ jumps localized exactly at the sampling times $t_k$ given by

\[
q[k] := \mathrm{d}\eta[k]\,\nabla_x h(x^*(t_k),t_k),
\]

for all $k=0,\ldots,N$, and that $q$ remains absolutely continuous elsewhere. The indirect numerical method proposed in [B21, Section 5] is based on the shooting map

\[
\Big( x(T), \ (\mathrm{d}\eta[k])_{k=0,\ldots,N} \Big) \longmapsto \Big( x(0)-x_0, \ \big(\mathrm{d}\eta[k]\,h(x(t_k),t_k)\big)_{k=0,\ldots,N} \Big),
\]

where:

(i) we provide a guess of the final value $x(T)$ and of the nonnegative Dirac values $\mathrm{d}\eta[k]$ for all $k=0,\ldots,N$;

(ii) we compute $q(T)=-\nabla g(x(T))$;

(iii) we solve numerically the state equation and the adjoint equation backward in time (from $t=T$ to $t=0$), using the nonpositive averaged Hamiltonian gradient condition in order to compute the control values $u_k$ for all $k=0,\ldots,N-1$;

(iv) we finally compute $x(0)-x_0$ and $\mathrm{d}\eta[k]\,h(x(t_k),t_k)$ for all $k=0,\ldots,N$.

Then we used the MATLAB function fsolve in order to find the zeros of the above shooting map. As an illustration of the above indirect numerical method, we solve a simple example in the next paragraph.

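Example: an optimal consumption problem with an affine inequality state constraint. We consider the problem given by

\[
\begin{array}{rl}
\text{minimize} & \displaystyle\int_0^{12} (u(\tau)-1)\,x(\tau)\,\mathrm{d}\tau \\[6pt]
\text{subject to} & x\in\mathrm{AC}([0,12],\mathbb{R}), \quad u\in\mathrm{PC}^P([0,12],\mathbb{R}), \\[3pt]
& \dot{x}(t) = u_k\,x(t), \quad \text{a.e. } t\in[t_k,t_{k+1}), \quad \forall k=0,\ldots,N-1, \\[3pt]
& x(0) = 1, \\[3pt]
& x(t)-10t-2\leq 0, \quad \forall t\in[0,12], \\[3pt]
& u_k\in[0,1], \quad \forall k=0,\ldots,N-1,
\end{array}
\tag{Ex1}
\]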

where $P$ is a fixed uniform $N$-partition of the interval $[0,12]$. This problem corresponds to a classical optimal consumption problem (see, e.g., [139, p.5]) revisited with a sampled-data control and an affine inequality state constraint. After some simple computations (see [B21, Section 5.3] for details), we verify that Problem (Ex1) satisfies Assumptions (A1) and (A2). We now apply the indirect numerical method presented in the previous paragraph. As expected we observe in Figure 3.4 (with $N=4$) that the optimal trajectory returned by the algorithm activates the inequality state constraint at most at the sampling times $t_k$ (represented by dashed lines). As also expected, the jumps of the adjoint vector $q$ occur exactly at the same activating times. Figure 3.5 continues to illustrate this bouncing trajectory phenomenon with $N=6$.
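To make the bouncing behavior tangible, the following Python sketch simulates the state equation of Problem (Ex1) for one hand-built admissible control with N = 4. This control is not the optimal one and is not taken from [B21]; its values are chosen (hypothetically) so that the trajectory reaches the constraint exactly at t = 6, 9 and 12. On each sampling interval the state equation has the closed-form solution x(t) = x(t_k) exp(u_k (t - t_k)), which the sketch uses directly. The function names and the grid size are assumptions of this sketch.

```python
import bisect
import math

times = [0.0, 3.0, 6.0, 9.0, 12.0]  # uniform 4-partition of [0, 12]


def h(x, t):
    """Inequality state constraint of (Ex1): admissible iff h(x(t), t) <= 0."""
    return x - 10.0 * t - 2.0


def trajectory(controls, times, x0=1.0):
    """Closed-form solution of x' = u_k * x on each [t_k, t_{k+1}), x(0) = x0."""
    knots = [x0]  # states at the sampling times
    for u_k, t0, t1 in zip(controls, times[:-1], times[1:]):
        knots.append(knots[-1] * math.exp(u_k * (t1 - t0)))

    def x(t):
        # Sampling interval containing t (the last one for the final time).
        k = min(bisect.bisect_right(times, t) - 1, len(controls) - 1)
        return knots[k] * math.exp(controls[k] * (t - times[k]))

    return x


# Hand-built control: full growth first, then just enough growth to land on
# the constraint x = 10 t + 2 at the sampling times 6, 9 and 12.
controls = [
    1.0,
    math.log(62.0 / math.exp(3.0)) / 3.0,
    math.log(92.0 / 62.0) / 3.0,
    math.log(122.0 / 92.0) / 3.0,
]
x = trajectory(controls, times)

# Activating times among the sampling times (up to a numerical tolerance).
activating = [t for t in times if abs(h(x(t), t)) < 1e-9]

# Largest constraint value on a fine grid that excludes the sampling times.
interior_max = max(
    h(x(12.0 * i / 1200), 12.0 * i / 1200)
    for i in range(1201)
    if i % 300 != 0
)
```

In this example `interior_max` is negative, so the constraint is only reached at sampling times. This is consistent with the fact that, for $u_k\geq 0$ and $x>0$, the map $t\mapsto h(x(t),t)$ is convex on each sampling interval and therefore attains its maximum at an endpoint.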



3.5 Application to continuous-time min-max optimal sampled-data control problems

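This section is extracted from my work [B22, Section 4] in collaboration with Bettiol. Let us fix some $T>0$ and $P=\{t_k\}_{k=0,\ldots,N}$ an $N$-partition of the interval $[0,T]$. In this section we will focus on the general continuous-time min-max optimal sampled-data control problem given by

\[
\begin{array}{rl}
\text{minimize} & \displaystyle\max_{t\in[0,T]} F(x(t),t), \\[6pt]
\text{subject to} & x\in\mathrm{AC}([0,T],\mathbb{R}^n), \quad u\in\mathrm{PC}^P([0,T],\mathbb{R}^m), \\[3pt]
& \dot{x}(t) = f(x(t),u_k,t), \quad \text{a.e. } t\in[t_k,t_{k+1}), \quad \forall k=0,\ldots,N-1, \\[3pt]
& x(0) = x_0, \\[3pt]
& x(T) = x_f, \\[3pt]
& u_k\in\mathrm{U}, \quad \forall k=0,\ldots,N-1,
\end{array}
\tag{MMP}
\]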

where $x_0,x_f\in\mathbb{R}^n$ are fixed and where $F:\mathbb{R}^n\times[0,T]\to\mathbb{R}$ is a given real function which is assumed to be continuous and of class $\mathrm{C}^1$ in its first variable. Following a well known idea (developed for example in [135, Remark 6] or in [197, p.351]), one can reformulate Problem (MMP) as a state constrained continuous-time optimal sampled-data control problem, that is, as a particular case of Problem (P). Applying Theorem 3.6, we obtained in [B22, Proposition 2] the next result.

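Proposition 3.20. If $(x^*,u^*)\in\mathrm{AC}([0,T],\mathbb{R}^n)\times\mathrm{PC}^P([0,T],\mathbb{R}^m)$ is a solution to Problem (MMP), then there exist $\lambda\geq 0$, $p\in\mathrm{AC}([0,T],\mathbb{R}^n)$ and a finite nonnegative Borel measure $\mathrm{d}\eta$ on $[0,T]$ such that the following conditions are satisfied:

(i) Nontriviality: $(p,\lambda,\mathrm{d}\eta)\neq 0$;

(ii) Adjoint equation:

\[
-\dot{p}(t) = \nabla_x f(x^*(t),u^*_k,t)^\top\times q(t), \quad \text{a.e. } t\in[t_k,t_{k+1}), \quad \forall k=0,\ldots,N-1;
\]

(iii) Transversality condition: $\mathrm{d}\eta([0,T])=\lambda$;

(iv) Nonpositive averaged Hamiltonian gradient condition:

\[
\Big\langle \int_{t_k}^{t_{k+1}} \nabla_u f(x^*(\tau),u^*_k,\tau)^\top\times q(\tau)\,\mathrm{d}\tau, \ v-u^*_k \Big\rangle_{\mathbb{R}^m} \leq 0,
\]

for all $v\in\mathrm{U}$ and all $k=0,\ldots,N-1$;

(v) Complementary slackness condition:

\[
\mathrm{supp}(\mathrm{d}\eta) \subset \{\, t\in[0,T] \mid F(x^*(t),t)=F^* \,\},
\]

where $F^* := \max_{t\in[0,T]} F(x^*(t),t)$.

Here $q\in\mathrm{BV}([0,T],\mathbb{R}^n)$ is defined by

\[
\forall t\in[0,T], \quad q(t) := p(t) + \int_{[0,t]} \nabla_x F(x^*(\tau),\tau)\,\mathrm{d}\eta(\tau).
\]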
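For the reader's convenience, the reformulation mentioned before Proposition 3.20 can be sketched as follows. This is the standard augmentation argument, written here under the assumption that the additional scalar state is denoted $z$; the exact formulation used in [B22] may differ.

```latex
\begin{equation*}
\begin{array}{rl}
\text{minimize} & z(T), \\[3pt]
\text{subject to} & (x,z) \in \mathrm{AC}([0,T],\mathbb{R}^{n}\times\mathbb{R}), \quad
                    u \in \mathrm{PC}^{P}([0,T],\mathbb{R}^m), \\[3pt]
& \dot{x}(t) = f(x(t),u_k,t), \quad \dot{z}(t) = 0, \quad
  \text{a.e. } t \in [t_k,t_{k+1}), \ \forall k = 0,\ldots,N-1, \\[3pt]
& x(0) = x_0, \quad x(T) = x_f, \quad z(0) \text{ free}, \\[3pt]
& F(x(t),t) - z(t) \leq 0, \quad \forall t \in [0,T], \\[3pt]
& u_k \in \mathrm{U}, \quad \forall k = 0,\ldots,N-1.
\end{array}
\end{equation*}
```

Since $z$ is constant, the state constraint forces $z\geq\max_{t\in[0,T]}F(x(t),t)$, and minimizing $z(T)$ turns this inequality into an equality at the optimum, so the optimal value of this problem coincides with that of Problem (MMP).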

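As an illustration of Proposition 3.20, we have numerically solved in [B22, Section 4.2] a maximal velocity minimization problem for the continuous-time harmonic oscillator with sampled-data control. This problem can be reformulated as the continuous-time min-max optimal sampled-data control problem given by

\[
\begin{array}{rl}
\text{minimize} & \displaystyle\max_{t\in[0,2]} x_2(t), \\[6pt]
\text{subject to} & x=(x_1,x_2)\in\mathrm{AC}([0,2],\mathbb{R}^2), \quad u\in\mathrm{PC}^P([0,2],\mathbb{R}), \\[3pt]
& \begin{pmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \end{pmatrix} = \begin{pmatrix} x_2(t) \\ -x_1(t)+u_k \end{pmatrix}, \quad \text{a.e. } t\in[t_k,t_{k+1}), \quad \forall k=0,\ldots,N-1, \\[9pt]
& \begin{pmatrix} x_1(0) \\ x_2(0) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} x_1(2) \\ x_2(2) \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \\[9pt]
& u_k\in[-1,1], \quad \forall k=0,\ldots,N-1,
\end{array}
\tag{Ex2}
\]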

where $P$ is a fixed uniform $N$-partition of the interval $[0,2]$. Note that Problem (Ex2) is a particular case of Problem (MMP). In [B22, Section 4.2], we applied Proposition 3.20 to Problem (Ex2) and we were able to adapt the indirect numerical method presented in Section 3.4.3 to the present context. I would like to emphasize, however, that the situation is more intricate here. Indeed, although we were able to prove that the optimal trajectory displays a bouncing trajectory phenomenon, it cannot be proved that the activating times only occur at the sampling times $t_k$. Still, we were able to prove in [B22, Section 4.2] that there is at most one activating time on each sampling interval $(t_k,t_{k+1})$. In particular the activating times are in a finite number (at most $2N+1$), but they are not localized. We therefore adapted the indirect numerical method by adding guesses of the activating times to the inputs. Figure 3.6 displays the numerical results obtained in the case $N=10$. Note that the optimal trajectory given in Figure 3.6 displays a bouncing trajectory phenomenon. Figure 3.7 provides a zoom on the behavior of $x^*_2$ with respect to the maximal value $V^* := \max_{t\in[0,2]} x^*_2(t)$. One can see that $V^*\simeq 0.6929$ is attained a finite number of times (exactly four) which are not sampling times, and which exactly correspond to the discontinuity jumps of the BV adjoint vector $q_2$.
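The state equation of Problem (Ex2) can be integrated exactly on each sampling interval, which makes it easy to evaluate the min-max cost of a given sampled-data control. The Python sketch below does this for an arbitrary sequence of control values. It is only a simulation aid and not the indirect method of [B22]; the function names and the number of samples per interval are assumptions of this sketch. The constant control used for the check is a sanity test only, since it does not satisfy the terminal condition of (Ex2).

```python
import math


def oscillator_flow(x1, x2, u, s):
    """Exact solution after a duration s of x1' = x2, x2' = -x1 + u (u constant)."""
    y = x1 - u  # shifted position, which satisfies y' = x2 and x2' = -y
    c, sn = math.cos(s), math.sin(s)
    return u + y * c + x2 * sn, -y * sn + x2 * c


def simulate(controls, times, x0=(0.0, 0.0), samples=200):
    """Return the final state and the largest sampled value of x2 on [t_0, t_N]."""
    x1, x2 = x0
    max_velocity = x2
    for u_k, t0, t1 in zip(controls, times[:-1], times[1:]):
        # Sample the velocity inside the current sampling interval.
        for i in range(1, samples + 1):
            _, v = oscillator_flow(x1, x2, u_k, (t1 - t0) * i / samples)
            max_velocity = max(max_velocity, v)
        # Move the state to the next sampling time.
        x1, x2 = oscillator_flow(x1, x2, u_k, t1 - t0)
    return (x1, x2), max_velocity


N = 10
times = [2.0 * k / N for k in range(N + 1)]  # uniform N-partition of [0, 2]
final_state, max_velocity = simulate([1.0] * N, times)
```

With the constant control equal to 1 and zero initial state, the exact solution is $x_1(t)=1-\cos t$ and $x_2(t)=\sin t$, which gives a simple way to validate the sketch before trying other control sequences.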



3.6 Figures

Figure 3.4 – Example 1 with N = 4.


[Figure panels for Example 2, images not recovered. Panels: state x*1; state x*2 (with the maximal velocity V* = 0.6929 marked); state (x*1, x*2); control u*; AC adjoint vector p1; AC adjoint vector p2; BV adjoint vector q1; BV adjoint vector q2. Except for the panel (x*1, x*2), the horizontal axis is the time t in [0, 2].]

[Zoom panel, image not recovered: state x*2 for t between 0.8 and 1.6, with values between 0.66 and 0.72.]

Figure 3.7 – Example 2: Zoom on the state $x^*_2$ in the case $N=10$.


Chapter 4

Optimal sampled-data control problems with free sampling times and application to functional electrical stimulations in medicine

4.1 Introduction

4.2 Optimal sampled-data control problems with free sampling times

4.2.1 Main result

4.2.2 Comments on the proof of Theorem 4.3

4.2.3 Comments on the continuity of the maximized Hamiltonian function

4.2.4 Numerical illustrations with a simple linear-quadratic example

4.2.5 Perspectives

4.3 Application to optimal muscular force response to functional electrical stimulations

4.3.1 Ding et al. force-fatigue model

4.3.2 A general rewriting of the model and necessary optimality conditions

4.3.3 Preliminary numerical results

4.3.4 Figures and tables

The present chapter summarizes the contributions of the two following references:

• [B19]: L. Bourdin and G. Dhar. Continuity/constancy of the Hamiltonian function in a Pontryagin maximum principle for optimal sampled-data control problems with free sampling times. Math. Control Signals Systems, 31(4):503–544, 2019.

• [B20]: T. Bakir, B. Bonnard, L. Bourdin, and J. Rouot. Pontryagin-type conditions for optimal muscular force response to functional electrical stimulations. J. Optim. Theory Appl. (to appear), 2019.


4.1 Introduction

In the previous Chapter 3, we have stated in Theorem 3.6 a Pontryagin Maximum Principle (PMP in short) that can handle continuous-time optimal sampled-data control problems by taking $\mathbb{T}=\mathbb{R}_+$ with $(a,b)=(0,T)$ for some $T>0$ and $\mathbb{T}_1 = P\cup[T,+\infty)$ where $P=\{t_k\}_{k=0,\ldots,N}$ is a fixed $N$-partition of the interval $[0,T]$. In that context one can (only) deal with fixed sampling times $t_k$, and thus cannot deal with their optimization.

Contributions of the paper [B19]. The main objective of my first work [B19] with my PhD student Gaurav Dhar was to derive a PMP that can handle, not only sampled-data controls, but also free sampling times. I mention that optimal sampling times problems have already been investigated in the literature but, to the best of my knowledge, never from a PMP point of view. For example several authors (such as in [171, 177]) consider the related problem of finding the optimal diameter for a uniform partition of the infinite time horizon interval $[0,+\infty)$. Nonuniform partitions have also been studied but in specific cases such as for the linear-quadratic integrator in [188, 189].

In [B19, Theorem 2.1] we obtained a PMP for (state constraint-free) continuous-time optimal sampled-data control problems with free sampling times. This result (which is recalled in Theorem 4.3 below) is the central topic of the present chapter. Its proof and the major difficulties encountered are commented in Section 4.2.2. Similarly to Theorem 3.6, we obtained a necessary optimality condition described by a nonpositive averaged Hamiltonian gradient condition. Furthermore, from the freedom of choosing sampling times, we get a new and additional necessary optimality condition which happens to coincide with the continuity of the maximized Hamiltonian function $H$ (see definition in Remark 3.16 of the previous chapter). I refer to Section 4.2.3 for a detailed discussion on that feature.

Recall that, in the classical case of (state constraint-free) continuous-time optimal permanent control problems, the continuity of the maximized Hamiltonian function $H$ is a very well known fact (see, e.g., [141, Theorem 2.6.3]). This classical property does not hold in general when considering sampled-data controls with fixed sampling times (see Remark 3.16). However note that the work provided in [B19] shows that this continuity property is recovered when considering sampled-data controls with optimal sampling times. This discussion is illustrated with a simple linear-quadratic example in Section 4.2.4 which is numerically solved by using a shooting method based on this continuity property. Finally some perspectives are listed in Section 4.2.5.

Contributions of the paper [B20]. In 2018, I was contacted by Toufik Bakir and Bernard Bonnard from the University of Dijon (France) and Jérémy Rouot from EPF engineering school of Troyes (France) in order to join their applied research group working on muscular reeducation by electrical impulses. The resulting paper [B20], written by the four of us, fits in an industrial research project whose aim is to design a smart electrical muscle stimulator. Information on the practical issues about this project can be found in [108, 199].

Predicted muscular force response to functional electrical stimulations (in short, FES) is utilized in biomechanics for muscular reeducation and in case of paralysis. A simplified model (see [131, 132, 169]) is a non-fatigue model derived from the Hill equation and used in the context of biochemistry and pharmacology [146]. More complete models (see [133, 134]), taking into account the muscle fatigue, were obtained recently in the framework of model identification and produce a dynamics described by a set of five differential equations. In the paper [B20] we used the so-called Ding et al. force-fatigue model [134] (see Section 4.3.1 for details) and our main concern was the mathematical computation of optimized electrical pulse trains (for example in view of maximizing the final force response).

In the Ding et al. force-fatigue model, the physical FES input is modeled by Dirac pulses and is integrated using a linear dynamics, which leads to a sampled-data control system, where the control parameters are the pulse amplitudes and the pulse times. In the paper [B20] we have considered the problem of minimizing a general cost function of Mayer form which depends on the final values of the force response and of the fatigue variables. The resulting optimization problem is clearly related to optimal sampled-data control problems with free sampling times, but it does not fit exactly with the framework studied in [B19]. I refer to Section 4.3.2 for the details. As a consequence our main contribution in the paper [B20] was to adapt the techniques of [B19] to our specific problem in order to derive the corresponding Pontryagin-type conditions (see Theorem 4.8 in Section 4.3.2 of the present chapter).

As an illustration we proposed in [B20, Section 4] some preliminary numerical simulations in the context of maximization of the final force response (with fixed pulse amplitudes), determining the corresponding optimal pulse times. Precisely we first implemented a direct method using the BOCOP software [119], which allowed us, in a second step, to initialize and implement an indirect method (shooting method) based on the Pontryagin-type conditions derived in our main result [B20, Theorem 3.1] and using the HAMPATH software [127]. I refer to Section 4.3.3 for these numerical illustrations.

4.2 Optimal sampled-data control problems with free sampling times

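The whole section is extracted from the paper [B19] written jointly with Dhar. Let $m, n, \ell, N \in \mathbb{N}^*$ be four positive integers and $T > 0$ be fixed. In what follows we will preserve all notations, all assumptions and all terminology introduced in Section 3.3.2 of the previous chapter, and we will focus on the general (state constraint-free) continuous-time optimal sampled-data control problem (Q) given by

\[
\begin{array}{rl}
\text{minimize} & g(x(0),x(T)) + \displaystyle\int_0^T L(x(\tau),u(\tau),\tau)\, d\tau, \\[2mm]
\text{subject to} & P = \{t_k\}_{k=0,\ldots,N} \in \mathcal{P}_N, \quad x \in \mathrm{AC}([0,T],\mathbb{R}^n), \quad u \in \mathrm{PC}^P([0,T],\mathbb{R}^m), \\
& \dot{x}(t) = f(x(t),u_k,t), \quad \text{a.e. } t \in [t_k,t_{k+1}), \quad \forall k = 0,\ldots,N-1, \\
& \psi(x(0),x(T)) \in \mathrm{S}, \\
& u_k \in \mathrm{U}, \quad \forall k = 0,\ldots,N-1,
\end{array}
\tag{Q}
\]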

where $\mathcal{P}_N$ stands for the set of all $N$-partitions of the interval $[0,T]$, that is

\[ \mathcal{P}_N := \big\{ P = \{t_k\}_{k=0,\ldots,N} \mid 0 = t_0 < t_1 < \ldots < t_{N-1} < t_N = T \big\}, \]

and where, for all $P \in \mathcal{P}_N$, the set $\mathrm{PC}^P([0,T],\mathbb{R}^m)$ stands, as in Section 3.4 of Chapter 3, for the set of piecewise constant functions defined on $[0,T]$ with values in $\mathbb{R}^m$ respecting the partition $P$.

In the previous Chapter 3, we have discussed on several occasions Problem (P) in the case where $\mathbb{T} = \mathbb{R}_+$ with $(a,b) = (0,T)$ and $\mathbb{T}_1 = P \cup [T,+\infty)$, where $P \in \mathcal{P}_N$ is a fixed $N$-partition of the interval $[0,T]$ (see Section 3.4 for example). The major difference in the above Problem (Q) is that the $N$-partition $P$ is not fixed. We say that the sampling times $t_k$ are free and thus they become $N-1$ parameters to optimize. We say that Problem (Q) is an optimal sampled-data control problem with free sampling times.

A triplet $(P,x,u)$, with $P \in \mathcal{P}_N$ and $(x,u) \in \mathrm{AC}([0,T],\mathbb{R}^n) \times \mathrm{PC}^P([0,T],\mathbb{R}^m)$, is said to be admissible for Problem (Q) if it satisfies all its constraints. A solution to Problem (Q) is an admissible triplet $(P^*,x^*,u^*)$ which minimizes the Bolza cost among all admissible triplets. In that case the corresponding optimal sampling times are denoted by $t^*_k$.

Remark 4.1. The set of all piecewise constant functions over $[0,T]$ with values in $\mathbb{R}^m$ respecting at least one $N$-partition is given by

\[ \mathrm{PC}_N([0,T],\mathbb{R}^m) := \bigcup_{P \in \mathcal{P}_N} \mathrm{PC}^P([0,T],\mathbb{R}^m). \]

Note that Problem (Q) can be rewritten by replacing “$u \in \mathrm{PC}^P([0,T],\mathbb{R}^m)$” in the second line by “$u \in \mathrm{PC}_N([0,T],\mathbb{R}^m)$” and by removing “$P = \{t_k\}_{k=0,\ldots,N} \in \mathcal{P}_N$”. I take this occasion to emphasize that the control set $\mathrm{PC}_N([0,T],\mathbb{R}^m)$ is neither a linear space nor a convex set. In view of establishing a Pontryagin maximum principle for Problem (Q), this lack of convexity constitutes a difficulty when invoking the Ekeland variational principle together with convex $L^\infty$-perturbations of the control, in contrast to the case where the $N$-partition $P \in \mathcal{P}_N$ is fixed (in which the control set $\mathrm{PC}^P([0,T],\mathbb{R}^m)$ is indeed a linear space). I refer to Section 4.2.2 for details.
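
The lack of convexity mentioned in Remark 4.1 can be checked on a toy case. The following minimal sketch is not taken from [B19]: the two controls, the value N = 2 and the helper names `evaluate` and `count_pieces` are hypothetical choices made for illustration only. It builds two scalar controls that each respect some 2-partition of [0,1] and counts the constant pieces of their midpoint.

```python
from fractions import Fraction

def evaluate(breaks, values, t):
    """Value at t of the function equal to values[k] on [breaks[k], breaks[k+1])."""
    for k in range(len(values)):
        if breaks[k] <= t < breaks[k + 1]:
            return values[k]
    return values[-1]  # convention at the final time

def count_pieces(f, grid):
    """Number of maximal constant pieces of f, for a grid containing all jump points of f."""
    samples = [f(t) for t in grid[:-1]]  # one sample per cell [grid[j], grid[j+1])
    return 1 + sum(1 for a, b in zip(samples, samples[1:]) if a != b)

# Two hypothetical controls with N = 2, each respecting its own 2-partition of [0, 1].
u_breaks, u_values = [Fraction(0), Fraction(1, 2), Fraction(1)], [0, 1]
v_breaks, v_values = [Fraction(0), Fraction(1, 4), Fraction(1)], [0, 1]

# The midpoint (u + v) / 2 is examined on the union of both partitions.
grid = sorted(set(u_breaks) | set(v_breaks))

def u(t):
    return evaluate(u_breaks, u_values, t)

def v(t):
    return evaluate(v_breaks, v_values, t)

def midpoint(t):
    return Fraction(1, 2) * (u(t) + v(t))

pieces_u = count_pieces(u, grid)
pieces_v = count_pieces(v, grid)
pieces_mid = count_pieces(midpoint, grid)
print(pieces_u, pieces_v, pieces_mid)
```

In this toy case the midpoint needs three constant pieces while N = 2 allows only two, so it leaves the control set even though both u and v belong to it.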

Remark 4.2. In the paper [B19] with Dhar, we have considered the possibility of letting the final time $T > 0$ be free. In that context (which will not be discussed in the present manuscript), the Mayer cost function $g$ and the terminal state constraint function $\psi$ can possibly depend on $T$.

4.2.1 Main result

The main contribution of the paper [B19], written in collaboration with Dhar, was to state a Pontryagin maximum principle for Problem (Q) (which is recalled in Theorem 4.3 below). This result makes it possible to handle not only sampled-data controls, but also free sampling times. In that context, a new necessary optimality condition is derived (see item (v) in Theorem 4.3) which happens to coincide with the continuity of the maximized Hamiltonian function $\mathcal{H}$ (see definition in Remark 3.16 of the previous chapter). A discussion devoted to this feature is provided in Section 4.2.3. Before coming to this point, the main result [B19, Theorem 2.1] is recalled below and its proof is commented in the next Section 4.2.2.

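Theorem 4.3 (Pontryagin maximum principle). If $(P^*,x^*,u^*)$ is a solution to Problem (Q) and $\psi$ is submersive at $(x^*(0),x^*(T))$, then there exist $\lambda \geq 0$ and $p \in \mathrm{AC}([0,T],\mathbb{R}^n)$ such that:

(i) Nontriviality: $(p,\lambda) \neq 0$;

(ii) Adjoint equation:

\[ -\dot{p}(t) = \nabla_x H(x^*(t),u^*_k,p(t),\lambda,t), \quad \text{a.e. } t \in [t^*_k,t^*_{k+1}), \quad \forall k = 0,\ldots,N-1; \]

(iii) Transversality condition:

\[ \begin{pmatrix} p(0) \\ -p(T) \end{pmatrix} = \lambda \nabla g(x^*(0),x^*(T)) + \nabla \psi(x^*(0),x^*(T))^\top \times \xi, \]

where $\xi \in \mathrm{N}_{\mathrm{S}}[\psi(x^*(0),x^*(T))]$;

(iv) Nonpositive averaged Hamiltonian gradient condition:

\[ \left\langle \int_{t^*_k}^{t^*_{k+1}} \nabla_u H(x^*(\tau),u^*_k,p(\tau),\lambda,\tau)\, d\tau , \; v - u^*_k \right\rangle_{\mathbb{R}^m} \leq 0, \]

for all $v \in \mathrm{U}$ and all $k = 0,\ldots,N-1$;

(v) Hamiltonian continuity condition:

\[ H(x^*(t^*_k),u^*_{k-1},p(t^*_k),\lambda,t^*_k) = H(x^*(t^*_k),u^*_k,p(t^*_k),\lambda,t^*_k), \]

for all $k = 1,\ldots,N-1$.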

Remark 4.4. The majority of the remarks listed after the statement of Theorem 3.6 in the previous chapter still hold in the present setting of free sampling times.

4.2.2 Comments on the proof of Theorem 4.3

Theorem 4.3 can be found in [B19, Theorem 2.1]. Its proof is based, like the proof of Theorem 3.6, on the Ekeland variational principle [138, Theorem 1.1]. However, taking free sampling times into account in Problem (Q) requires several adjustments.

Firstly one has to perform the sensitivity analysis of the state equation with respect to perturbations $t_k \pm \delta$ of the sampling times $t_k$. Actually, a perturbation $t_k + \delta$ (for example) can be seen as an explicit needle-like $L^1$-perturbation $u^\delta$ of the control $u$, where $u^\delta := u_{k-1}$ over $[t_k,t_k+\delta)$ and $u^\delta := u$ elsewhere. Thus one simply obtains a variation vector $w$ which is the unique global solution to the usual linearized control system with, above all, the initial condition given by $w(t_k) = f(x(t_k),u_{k-1},t_k) - f(x(t_k),u_k,t_k)$. I refer to [B19, Appendix A.2.4] for all details. Note that this expression of the initial condition $w(t_k)$ is of course at the origin of the Hamiltonian continuity condition obtained in item (v) of Theorem 4.3.

Secondly, when invoking the Ekeland variational principle, a sequence of sampled-data controls $(u^i)_{i \in \mathbb{N}}$ in the control set $\mathrm{PC}_N([0,T],\mathbb{R}^m)$, which converges in $L^1$-norm to the optimal control $u^*$, is constructed. Here, three major difficulties emerge due to the fact that each sampled-data control $u^i$ has its own $N$-partition $P^i = \{t^i_k\}_{k=0,\ldots,N} \in \mathcal{P}_N$. These three difficulties are listed below.

(i) Firstly the sampling times $t^i_k$ do not necessarily converge to the optimal sampling times $t^*_k$ when $i \to +\infty$ (even up to subsequences);

(ii) Secondly a phenomenon of accumulation of the sampling times $t^i_k$ when $i \to +\infty$ is possible.

The two above degenerate situations can occur, for example, if the optimal control $u^*$ is constant over two consecutive sampling intervals $[t^*_{k-1},t^*_k)$ and $[t^*_k,t^*_{k+1})$. In order to overcome the two above difficulties, we introduced a technical control set (see [B19, Equation (5) in Appendix A.1]) which guarantees that the sampling times $t^i_k$ produced by the Ekeland variational principle, firstly, remain unchanged for the ones corresponding to the consecutive sampling intervals on which the optimal control $u^*$ is constant (thus avoiding difficulty (i)) and, secondly, are contained in disjoint intervals for the others (thus avoiding difficulty (ii)). With this method, we were able in [B19, Proposition A.2] to obtain the convergence of $t^i_k$ to $t^*_k$ when $i \to +\infty$ for each $k = 1,\ldots,N-1$. Moreover, at sampling times where the optimal control $u^*$ is constant over two consecutive sampling intervals, the Hamiltonian continuity condition is trivial (and thus does not require any proof), while the other sampling times are left free (in disjoint intervals) and thus can be subject to perturbations, which led us to the Hamiltonian continuity condition.

(iii) A final obstacle lies in the lack of convexity of the control set $\mathrm{PC}_N([0,T],\mathbb{R}^m)$. Indeed, when applying the Ekeland variational principle, we are led to fix $i \in \mathbb{N}$ and to perturb the control $u^i$. The standard procedure of convex $L^\infty$-perturbations of the control (recalled in Section 3.3.5) would consist in fixing some $v \in \mathrm{PC}_N([0,T],\mathbb{R}^m)$ and considering the perturbed control $u^{i,\delta} := u^i + \delta(v - u^i)$. However, due to the above-mentioned lack of convexity, $u^{i,\delta} \notin \mathrm{PC}_N([0,T],\mathbb{R}^m)$ in general.

In order to overcome this third difficulty, we have proved that, fixing some $v \in \mathrm{PC}_N([0,T],\mathbb{R}^m)$ and since $t^i_k$ converges to $t^*_k$ when $i \to +\infty$, one can construct a sequence $(v^i)_{i \in \mathbb{N}}$, with each $v^i \in \mathrm{PC}^{P^i}([0,T],\mathbb{R}^m)$ respecting the partition $P^i$ associated to $u^i$, which converges in $L^1$-norm to $v$. I refer to [B19, proof of Lemma A.10] for all details. Hence, fixing $i \in \mathbb{N}$, we introduced the perturbed control given by $u^{i,\delta} := u^i + \delta(v^i - u^i)$ which does belong to the control set $\mathrm{PC}_N([0,T],\mathbb{R}^m)$. Thanks to this approach, Dhar and I were able to obtain the Hamiltonian continuity condition stated in item (v) of Theorem 4.3 by letting $\delta \to 0^+$ first, and then $i \to +\infty$. The rest of the proof of Theorem 4.3 is similar to the proof of Theorem 3.6.

Remark 4.5. One may be interested in Problem (Q) with an additional constraint on the free sampling times $t_k$ of the form $t_{k+1} - t_k \geq I_{\min}$ for all $k = 0,\ldots,N-1$, for some fixed $I_{\min} > 0$. In that context, following the proof of [B19, Theorem 2.1], one can easily be convinced that item (v) in Theorem 4.3 is preserved for all $k \in \{1,\ldots,N-1\}$ such that $\min(t^*_k - t^*_{k-1}, t^*_{k+1} - t^*_k) > I_{\min}$, but has to be replaced by the weaker condition

\[ H(x^*(t^*_k),u^*_{k-1},p(t^*_k),\lambda,t^*_k) \leq H(x^*(t^*_k),u^*_k,p(t^*_k),\lambda,t^*_k), \]

for all $k \in \{1,\ldots,N-1\}$ such that $t^*_k - t^*_{k-1} = I_{\min}$ and $t^*_{k+1} - t^*_k > I_{\min}$, and by the weaker condition

\[ H(x^*(t^*_k),u^*_{k-1},p(t^*_k),\lambda,t^*_k) \geq H(x^*(t^*_k),u^*_k,p(t^*_k),\lambda,t^*_k), \]

for all $k \in \{1,\ldots,N-1\}$ such that $t^*_k - t^*_{k-1} > I_{\min}$ and $t^*_{k+1} - t^*_k = I_{\min}$. However, if $t^*_k - t^*_{k-1} = t^*_{k+1} - t^*_k = I_{\min}$, then no necessary optimality condition on $t^*_k$ can be derived.

Remark 4.6. During the reviewing process of the paper [B19], an anonymous reviewer brought to our attention an alternative proof of Theorem 4.3. By adapting a remarkable technique presented in the paper [136] by Dmitruk and Kaganovich, Problem (Q) can be reparameterized such that each sampling interval $[t_k,t_{k+1}]$ maps to the interval $[0,1]$. In that situation, the free sampling times $t_k$ play the role of free terminal states which lead, through the application of the classical PMP, to a transversality condition which exactly coincides with the Hamiltonian continuity condition, while the values $u_k$ of the sampled-data control play the role of parameters which lead, through the application of a parameterized version of the classical PMP (see, e.g., [B12, Remark 5]), to a necessary optimality condition written in integral form which exactly coincides with the nonpositive averaged Hamiltonian gradient condition. This alternative proof is elegant. However, this approach is strongly based on the standard chain rule, which has no analogue either in time scale calculus or in fractional calculus. As a consequence the technique developed in [136] cannot be used in order to extend Theorem 4.3 to the contexts considered in Chapters 2 and 3 of the present manuscript, while the proof based on the Ekeland variational principle considered in the paper [B19] probably can.

4.2.3 Comments on the continuity of the maximized Hamiltonian function

For the needs of this section we denote by $(\mathrm{Q}^P)$ the same problem as Problem (Q) but considering that the $N$-partition $P = \{t_k\}_{k=0,\ldots,N} \in \mathcal{P}_N$ of the interval $[0,T]$ is fixed. Let $(x^*_P,u^*_P)$ be a solution to Problem $(\mathrm{Q}^P)$ and let us apply Theorem 3.6 with $\mathbb{T} = \mathbb{R}_+$, $(a,b) = (0,T)$ and $\mathbb{T}_1 = P \cup [T,+\infty)$. Let us denote by $(p_P,\lambda_P)$ the Lagrange multiplier obtained. Since $u^*_P$ is piecewise constant, one can easily see from the state and adjoint equations that the optimal trajectory $x^*_P$ and the adjoint vector $p_P$ are piecewise smooth of class $\mathrm{C}^1$ over the interval $[0,T]$, in the sense that they are of class $\mathrm{C}^1$ over each interval $[t_k,t_{k+1}]$. It is also clear that the associated maximized Hamiltonian function $\mathcal{H}_P : [0,T] \to \mathbb{R}$ defined by

\[ \mathcal{H}_P(t) := H(x^*_P(t),u^*_P(t),p_P(t),\lambda_P,t), \]

for all $t \in [0,T]$, is piecewise smooth of class $\mathrm{C}^1$ over $[0,T]$, in the sense that $\mathcal{H}_P$ is of class $\mathrm{C}^1$ over each semi-open interval $[t_k,t_{k+1})$. Moreover, if $H$ is differentiable with respect to $t$ (for example, if $f$ and $L$ are so) and since the couple $(x^*_P,p_P)$ satisfies the associated Hamiltonian system, it clearly holds that

\[ \dot{\mathcal{H}}_P(t) = \nabla_t H(x^*_P(t),u^*_P(t),p_P(t),\lambda_P,t), \]

over each semi-open interval $[t_k,t_{k+1})$. In particular, if additionally $H$ is independent of the variable $t$ (for example, if Problem $(\mathrm{Q}^P)$ is autonomous, that is, if $f$ and $L$ are independent of the variable $t$), we deduce that $\mathcal{H}_P$ is piecewise constant, in the sense that $\mathcal{H}_P$ is constant over each semi-open interval $[t_k,t_{k+1})$. However we know that $\mathcal{H}_P$ is not continuous over the whole interval $[0,T]$ in general. It may admit a discontinuity at each fixed sampling time $t_k$ (see Remark 3.16 or Figure 4.1 in Section 4.2.4 below).

Now let $(P^*,x^*,u^*)$ be a solution to Problem (Q) and let us apply Theorem 4.3. Let us denote by $(p,\lambda)$ the Lagrange multiplier obtained. The corresponding maximized Hamiltonian function $\mathcal{H} : [0,T] \to \mathbb{R}$ defined by

\[ \mathcal{H}(t) := H(x^*(t),u^*(t),p(t),\lambda,t), \]

for all $t \in [0,T]$, satisfies all the properties underlined above. However, in the present situation of optimal sampling times $t^*_k$, the Hamiltonian continuity condition in Theorem 4.3 implies that

\[ \lim_{\substack{t \to t^*_k \\ t < t^*_k}} \mathcal{H}(t) = H(x^*(t^*_k),u^*_{k-1},p(t^*_k),\lambda,t^*_k) = H(x^*(t^*_k),u^*_k,p(t^*_k),\lambda,t^*_k) = \mathcal{H}(t^*_k) = \lim_{\substack{t \to t^*_k \\ t > t^*_k}} \mathcal{H}(t), \]

for all $k = 1,\ldots,N-1$, which exactly corresponds to the continuity of $\mathcal{H}$ at each optimal sampling time $t^*_k$. In that situation we conclude that $\mathcal{H}$ is continuous over the whole interval $[0,T]$. In particular, if additionally $H$ is independent of the variable $t$, then $\mathcal{H}$ is constant over the whole interval $[0,T]$.

I refer to Section 4.2.4 below for numerical illustrations of the above discussion. In particular the Hamiltonian continuity condition in Theorem 4.3 is used in order to numerically compute the optimal sampling times $t^*_k$ in a simple linear-quadratic example.

4.2.4 Numerical illustrations with a simple linear-quadratic example

Trélat and I have provided in [B18, Theorem 2 and Corollary 1] a state feedback formulation for the optimal sampled-data controls of unconstrained linear-quadratic problems in the case of fixed sampling times. This result makes it possible, in particular, to solve these problems numerically by implementing a simple induction. Note that the contributions of the paper [B18] will be summarized in the next Chapter 5.

Dhar and I have provided in [B19, Section 3] a method to solve numerically the same unconstrained linear-quadratic optimal sampled-data control problems but in the case of free sampling times. Our idea was to implement a shooting method (with the MATLAB function fsolve) based on the Hamiltonian continuity condition provided in Theorem 4.3, that is, on the shooting map

\[ P = \{t_k\}_{k=0,\ldots,N} \in \mathcal{P}_N \longmapsto \Big( H(x^*_P(t_k),u^*_{P,k-1},p_P(t_k),\lambda_P,t_k) - H(x^*_P(t_k),u^*_{P,k},p_P(t_k),\lambda_P,t_k) \Big)_{k=1,\ldots,N-1}, \]

where, for each partition $P \in \mathcal{P}_N$, the couple $(x^*_P,u^*_P)$ stands for the solution of the unconstrained linear-quadratic optimal sampled-data control problem with the fixed partition $P$, and where the couple $(p_P,\lambda_P)$ stands for the Lagrange multiplier provided by Theorem 3.6. For each partition $P \in \mathcal{P}_N$, note that the tuple $(x^*_P,u^*_P,p_P,\lambda_P)$ is numerically computed following the induction method provided in [B18, Theorem 2 and Corollary 1].
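
The output of the shooting map above can be sketched in a few lines. The block below only evaluates the Hamiltonian jumps at the interior sampling times for given data. It is not the implementation of [B19]: the function name `shooting_residual`, the toy Hamiltonian `H_toy` and the placeholder trajectory and costate are hypothetical, and the inner linear-quadratic solve of [B18] is not reproduced.

```python
def shooting_residual(H, x, p, u, lam, times):
    """Hamiltonian jumps at the interior sampling times t_1, ..., t_{N-1}.

    H     : callable H(x, u, p, lam, t), the Hamiltonian
    x, p  : callables t -> state and costate along the candidate trajectory
    u     : list of the N control values u_0, ..., u_{N-1}
    lam   : multiplier associated with the cost
    times : list of the N + 1 sampling times t_0 < ... < t_N
    """
    residual = []
    for k in range(1, len(times) - 1):
        t = times[k]
        left = H(x(t), u[k - 1], p(t), lam, t)   # control value used just before t_k
        right = H(x(t), u[k], p(t), lam, t)      # control value used just after t_k
        residual.append(left - right)
    return residual

# Hypothetical scalar data, for illustration only (assumed form p * f - lam * L).
def H_toy(x, u, p, lam, t):
    return p * (x - u + t) - lam * (3.0 * x**2 + u**2)

def x_toy(t):
    return -4.0 + t   # placeholder trajectory, not an optimal one

def p_toy(t):
    return 2.0 - t    # placeholder costate, not an optimal one

times = [0.0, 0.25, 0.5, 1.0]
u = [1.0, 1.0, 3.0]   # u_0 = u_1, so the jump at t_1 vanishes trivially

res = shooting_residual(H_toy, x_toy, p_toy, u, 1.0, times)
print(res)
```

A root finder applied to this residual, with the state, costate and control values recomputed for each candidate partition, would play the role of the shooting method. The first entry also illustrates why the continuity condition is trivial when the control keeps the same value over two consecutive sampling intervals.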

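For illustration purposes, let us focus now on the unconstrained linear-quadratic optimal sampled-data control problem $(\mathrm{Q}_{\mathrm{ex}})$ with free sampling times given by

\[
\begin{array}{rl}
\text{minimize} & x(1)^2 + \displaystyle\int_0^1 \big( 3x(\tau)^2 + u(\tau)^2 \big)\, d\tau, \\[2mm]
\text{subject to} & P = \{t_k\}_{k=0,\ldots,N} \in \mathcal{P}_N, \quad x \in \mathrm{AC}([0,1],\mathbb{R}), \quad u \in \mathrm{PC}^P([0,1],\mathbb{R}), \\
& \dot{x}(t) = x(t) - u_k + t, \quad \text{a.e. } t \in [t_k,t_{k+1}), \quad \forall k = 0,\ldots,N-1, \\
& x(0) = -4.
\end{array}
\tag{$\mathrm{Q}_{\mathrm{ex}}$}
\]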
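
To make the example concrete, the following minimal sketch evaluates the cost of Problem (Qex) for a candidate partition and candidate control values. It is an illustration only and not the method of [B19]: the helper names `state_on_interval` and `bolza_cost` are hypothetical, the state is propagated with the closed-form solution of the scalar linear dynamics on each sampling interval, and the integral is approximated with a composite Simpson rule.

```python
import math

X0 = -4.0  # initial condition x(0) = -4 of Problem (Qex)

def state_on_interval(x_k, t_k, u_k, t):
    """Closed-form solution of x' = x - u_k + t started from x(t_k) = x_k."""
    return (x_k + t_k + 1.0 - u_k) * math.exp(t - t_k) - t - 1.0 + u_k

def bolza_cost(times, u, n_sub=200):
    """Cost x(1)^2 + int_0^1 (3 x^2 + u^2) dt for sampling times `times` and values `u`.

    The integral is approximated on each sampling interval by the composite
    Simpson rule with `n_sub` (even) subintervals.
    """
    x_k, integral = X0, 0.0
    for k, u_k in enumerate(u):
        a, b = times[k], times[k + 1]
        h = (b - a) / n_sub

        def running(t):
            return 3.0 * state_on_interval(x_k, a, u_k, t) ** 2 + u_k ** 2

        s = running(a) + running(b)
        s += sum((4.0 if j % 2 else 2.0) * running(a + j * h) for j in range(1, n_sub))
        integral += s * h / 3.0
        x_k = state_on_interval(x_k, a, u_k, b)  # state at the next sampling time
    return x_k ** 2 + integral

uniform = [k / 4 for k in range(5)]            # uniform 4-partition of [0, 1]
cost_zero = bolza_cost(uniform, [0.0] * 4)     # cost of the zero control
print(cost_zero)
```

With the zero control the state is $x(t) = -3e^t - t - 1$, so the printed value can be compared with the exact cost $(3e+2)^2 + 3\big(\tfrac{9}{2}(e^2-1) + 6e + \tfrac{7}{3}\big)$. Minimizing such a function over the partition and the control values is one possible direct approach.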

For the needs of this section we denote by $(\mathrm{Q}^P_{\mathrm{ex}})$ the same problem as Problem $(\mathrm{Q}_{\mathrm{ex}})$ but considering that $P = \{t_k\}_{k=0,\ldots,N} \in \mathcal{P}_N$ is the uniform $N$-partition of the interval $[0,1]$, that is, $t_k := k/N$ for all $k = 0,\ldots,N$. In particular note that the sampling times $t_k$ are fixed in Problem $(\mathrm{Q}^P_{\mathrm{ex}})$. I present hereafter the numerical simulations obtained with $N = 4$. I refer to Figure 4.1 for Problem $(\mathrm{Q}^P_{\mathrm{ex}})$ (fixed uniform sampling times) and to Figure 4.2 for Problem $(\mathrm{Q}_{\mathrm{ex}})$ (free sampling times). In both figures the sampling times are represented with dashed lines. I emphasize that both numerical results have been confirmed by direct numerical approaches.

Figure 4.1 – Problem $(\mathrm{Q}^P_{\mathrm{ex}})$ with $N = 4$ (fixed uniform sampling times). Optimal cost $C^*_P \simeq 44.5131$.

Figure 4.2 – Problem $(\mathrm{Q}_{\mathrm{ex}})$ with $N = 4$ (optimal sampling times). Optimal cost $C^* \simeq 44.3159$.

In Figure 4.1 (with fixed uniform sampling times $t_k$), as expected from Section 4.2.3, we observe that the maximized Hamiltonian function $\mathcal{H}_P$ is continuous over each semi-open interval $[t_k,t_{k+1})$ and has discontinuities at each $t_k$. In contrast, in Figure 4.2 (with optimal sampling times $t^*_k$), we observe that the maximized Hamiltonian function $\mathcal{H}$ is continuous over the whole interval $[0,1]$.

4.2.5 Perspectives

Two short-term perspectives are currently discussed with my PhD student Gaurav Dhar:

(i) A Filippov-type theorem for the existence of a solution to Problem (P) is stated in Theorem 3.5. This result can be applied to Problem $(\mathrm{Q}^P)$, that is, to optimal sampled-data control problems with a fixed partition $P \in \mathcal{P}_N$ (that is, with fixed sampling times). A possible extension would be to consider the case of free sampling times. However, I emphasize that, in the context of free sampling times, one would probably be faced with a difficulty similar to the one encountered in the proof of Theorem 4.3. Precisely, considering a minimizing sequence of sampled-data controls $u^i$ in the control set $\mathrm{PC}_N([0,T],\mathbb{R}^m)$ would lead to a sequence of partitions $P^i = \{t^i_k\}_{k=0,\ldots,N} \in \mathcal{P}_N$ and thus to the possibility of accumulation of sampling times $t^i_k$ when $i \to +\infty$. As a consequence, a cautious and rigorous mathematical treatment would be required in order to give a meaning to the limit of the sequence $(u^i)_{i \in \mathbb{N}}$ when accumulations of sampling times appear. On the other hand, note that Theorem 3.5 is established in a case where the control set is of infinite dimension, while the control set $\mathrm{PC}_N([0,T],\mathbb{R}^m)$ considered in this section is of finite dimension. This fundamental difference could lead to an existence result with possibly relaxed assumptions with respect to those of Theorem 3.5 (precisely without any convexity assumption).

(ii) A possible challenge would be to add inequality state constraints in Problem (Q). In that context one would obtain an adjoint vector $q$ which is not absolutely continuous in general, but (only) of bounded variation. As a consequence, it is not clear whether the corresponding maximized Hamiltonian function remains continuous. On the other hand, in the state constrained case with fixed sampling times and under (quite unrestrictive) assumptions, we have seen in Section 3.4 of Chapter 3 that the singular part of the adjoint vector $q$ vanishes and that its jumps are localized exactly at the sampling times. This behavior may possibly be compensated by letting the sampling times be free. This interesting question is postponed to further research works.

Several other research projects are possible. For instance:

(iii) In view of initializations of numerical algorithms, it would be relevant to get theoretical results about the distribution in the interval $[0,T]$ of the optimal sampling times with respect to $N \in \mathbb{N}^*$ and/or with respect to the data (cost, dynamics, constraints) of Problem (Q).

(iv) In Problem (Q), the positive integer $N \in \mathbb{N}^*$ (which corresponds to the maximal number of switches authorized to the control) is fixed. A natural project would be to let $N \in \mathbb{N}^*$ be free by making the Mayer cost depend on $N$ (that is, by considering that each switch of the control has a price).

I conclude this section by evoking two perspectives which are both related to parts of the present manuscript:

(v) In the context of unconstrained linear-quadratic problems, Trélat and I have proved in [B18, Theorem 1] that the optimal sampled-data control (with fixed sampling times) converges pointwise to the optimal permanent control when the diameter of the corresponding partition tends to zero. The convergence of the corresponding cost and the uniform convergences of the corresponding state and costate are also derived (see [B18, Remark 3]). I refer to the next Chapter 5 for details. An interesting research perspective would be to get similar convergence results in the context of the present section. Several directions can be investigated: nonlinear dynamics and terminal state constraints (see Chapter 6 for results in this direction) and, of course, free sampling times. To go even further in the context of free sampling times, a wonderful challenge would be to study the asymptotic behavior when letting $N$ tend to $+\infty$ (which is a weaker condition than the convergence to zero of the diameter of the partition).

(vi) A relevant research perspective would concern the extension of Theorem 4.3 to the more general framework in which the values of the free sampling times $t_k$ intervene explicitly in the cost and/or in the dynamics of Problem (Q). I take this occasion to mention the paper [B20] written in collaboration with Bakir, Bonnard and Rouot in which we derived Pontryagin-type conditions for a specific problem from medicine that can be written as an optimal sampled-data control problem in which the sampling times $t_k$ are free and intervene explicitly in the expression of the dynamics. I point out that, even in this very particular context, giving an expression of the necessary optimality conditions in a Hamiltonian form still remains an open mathematical question. The contributions of the paper [B20] are exactly the content of the next Section 4.3.

4.3 Application to optimal muscular force response to functional electrical stimulations

This section is extracted from the paper [B20] written jointly with Bakir, Bonnard and Rouot.

4.3.1 Ding et al. force-fatigue model

This section is dedicated to some reminders on the Ding et al. force-fatigue model [134]. Several notations will be introduced, but only for a temporary use. The FES input $v$ (or pulse train) is of the form

\[ v(t) := \sum_{k=0}^{N-1} v_k \delta(t - t_k), \quad \text{for all } t \in [0,T], \]

modeled as a finite sum of Dirac impulses $\delta$ at times $0 = t_0 < t_1 < \ldots < t_{N-1} < t_N = T$, where $N \in \mathbb{N}^*$ and $T > 0$ are fixed, and where $v_k \in [0,1]$ are the amplitudes of each pulse. To describe the phenomenon of tetania (that is, the memory effect of successive pulses), a scaling factor $R_k$ is given by

\[ R_k := \begin{cases} 1, & \text{for } k = 0, \\[1mm] 1 + (R-1)\exp\left( -\dfrac{t_k - t_{k-1}}{\tau_c} \right), & \text{for } k = 1,\ldots,N-1, \end{cases} \]

where the constants $\tau_c > 0$ and $R > 1$ are given in Table 4.1 below. The FES signal $E$ is defined as the solution in the distributional sense to the linear scalar dynamics

\[ \dot{E}(t) = -\frac{E(t)}{\tau_c} + \frac{1}{\tau_c} \sum_{k=0}^{N-1} R_k v_k \delta(t - t_k), \quad \text{over } [0,T], \]

with $E(0) = 0$. We get that

\[ E(t) = \frac{1}{\tau_c} \sum_{k=0}^{N-1} R_k \exp\left( -\frac{t - t_k}{\tau_c} \right) v_k \Theta(t - t_k), \quad \text{for all } t \in [0,T], \]

where $\Theta : \mathbb{R} \to \mathbb{R}$ stands for the usual left-continuous Heaviside function. The FES signal drives the evolution of the $\mathrm{Ca}^{2+}$-concentration $C_N$ according to the linear scalar dynamics

\[ \dot{C}_N(t) = -\frac{C_N(t)}{\tau_c} + E(t), \quad \text{a.e. } t \in [0,T]. \tag{4.1} \]

Integrating the system (4.1) with $C_N(0) = 0$ leads to

\[ C_N(t) = \frac{1}{\tau_c} \sum_{k=0}^{N-1} R_k \exp\left( -\frac{t - t_k}{\tau_c} \right) v_k (t - t_k) \Theta(t - t_k), \quad \text{for all } t \in [0,T]. \]
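
The closed-form expressions of $R_k$, $E$ and $C_N$ can be evaluated directly. The block below is a minimal sketch under stated assumptions: the function names are hypothetical, the list `times` contains only the pulse times $t_0, \ldots, t_{N-1}$ (the final time $T$ is excluded), and the pulse times and amplitudes in the usage lines are arbitrary illustrative values. The constants $\tau_c$ and $R$ are those of Table 4.1.

```python
import math

TAU_C = 0.02   # time constant tau_c from Table 4.1 (s)
R_BAR = 1.143  # enhancement term R from Table 4.1

def scaling_factors(times):
    """Scaling factors R_k for the pulse times t_0, ..., t_{N-1}."""
    factors = [1.0]
    for k in range(1, len(times)):
        gap = times[k] - times[k - 1]
        factors.append(1.0 + (R_BAR - 1.0) * math.exp(-gap / TAU_C))
    return factors

def fes_signal(t, times, amps):
    """Closed-form FES signal E(t)."""
    factors = scaling_factors(times)
    total = 0.0
    for R_k, t_k, v_k in zip(factors, times, amps):
        if t > t_k:  # left-continuous Heaviside function: Theta(0) = 0
            total += R_k * math.exp(-(t - t_k) / TAU_C) * v_k
    return total / TAU_C

def calcium(t, times, amps):
    """Closed-form concentration C_N(t), with C_N(0) = 0."""
    factors = scaling_factors(times)
    total = 0.0
    for R_k, t_k, v_k in zip(factors, times, amps):
        if t > t_k:
            total += R_k * math.exp(-(t - t_k) / TAU_C) * v_k * (t - t_k)
    return total / TAU_C

# Arbitrary illustrative pulse train (not taken from [B20]).
pulse_times = [0.0, 0.05, 0.12]
pulse_amps = [1.0, 0.5, 1.0]
print(scaling_factors(pulse_times))
print(fes_signal(0.2, pulse_times, pulse_amps), calcium(0.2, pulse_times, pulse_amps))
```

Away from the pulse times, a finite-difference derivative of `calcium` can be compared with $-C_N/\tau_c + E$ to check consistency with (4.1).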

Introducing the Hill functions $\beta(t) := \dfrac{C_N(t)}{K_m + C_N(t)}$ and $\gamma(t) := \dfrac{1}{\tau_1 + \tau_2 \beta(t)}$ for all $t \in [0,T]$, the non-fatigue model [131, 132, 169] describes the force response $F$ by the dynamics

\[ \dot{F}(t) = -\gamma(t) F(t) + A \beta(t), \quad \text{for all } t \in [0,T], \tag{4.2} \]

with $F(0) = 0$, where $A$, $K_m$, $\tau_1$, $\tau_2$ are given positive constants.

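The complete force-fatigue model [134] is obtained by considering $A$, $K_m$, $\tau_1$ as fatigue variables following the three linear dynamics

\[
\dot{A}(t) = -\frac{A(t) - A_{\mathrm{rest}}}{\tau_{\mathrm{fat}}} + \alpha_A F(t), \qquad
\dot{K}_m(t) = -\frac{K_m(t) - K_{m,\mathrm{rest}}}{\tau_{\mathrm{fat}}} + \alpha_{K_m} F(t), \qquad
\dot{\tau}_1(t) = -\frac{\tau_1(t) - \tau_{1,\mathrm{rest}}}{\tau_{\mathrm{fat}}} + \alpha_{\tau_1} F(t),
\tag{4.3}
\]

for all $t \in [0,T]$, with the initial conditions $A(0) = A_{\mathrm{rest}}$, $K_m(0) = K_{m,\mathrm{rest}}$, and $\tau_1(0) = \tau_{1,\mathrm{rest}}$. Table 4.1 below contains the definitions and details on the variables and constants introduced in this section.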

Symbol      | Unit     | Value        | Description
------------|----------|--------------|------------------------------------------------------------------
C_N         | -        | -            | Normalized amount of Ca2+-troponin complex
F           | mN       | -            | Force generated by muscle
t_k         | s        | -            | Time of the k-th pulse
N-1         | -        | -            | Total number of the pulses before the final time T
k           | -        | -            | Stimulation pulse index
τ_c         | s        | 0.02         | Time constant that commands the rise and the decay of C_N
R           | -        | 1.143        | Term of the enhancement in C_N from successive stimuli
A           | mN/s     | -            | Scaling factor for the force and the shortening velocity of muscle
τ_1         | s        | -            | Force decline time constant when strongly bound cross-bridges absent
τ_2         | s        | 0.1244       | Force decline time constant due to friction between actin and myosin
K_m         | -        | -            | Sensitivity of strongly bound cross-bridges to C_N
A_rest      | mN/s     | 3.009        | Value of the parameter A when muscle is not fatigued
K_m,rest    | -        | 0.103        | Value of the parameter K_m when muscle is not fatigued
τ_1,rest    | s        | 0.05095      | Value of the parameter τ_1 when muscle is not fatigued
α_A         | 1/s^2    | -4.0 × 10^-1 | Coefficient for the force-model parameter A in the fatigue model
α_Km        | 1/(s·mN) | 1.9 × 10^-2  | Coefficient for the force-model parameter K_m in the fatigue model
α_τ1        | 1/mN     | 2.1 × 10^-2  | Coefficient for the force-model parameter τ_1 in the fatigue model
τ_fat       | s        | 127          | Time constant controlling the recovery of (A, K_m, τ_1)

Table 4.1 – Values and description of the constant parameters in the Ding et al. model

4.3.2 A general rewriting of the model and necessary optimality conditions

Let us consider the complete force-fatigue model (4.1)-(4.2)-(4.3) and in what follows we denote by $x := (C_N,F,A,K_m,\tau_1)^\top$. The model can be rewritten as the control system

\[ \dot{x}(t) = F(x(t)) + b(t) \left( \sum_{k=0}^{N-1} G(t_{k-1},t_k) v_k \Theta(t - t_k) \right) e, \quad \text{a.e. } t \in [0,T], \tag{4.4} \]

with the initial condition $x(0) = x_0$, where $x_0 \in \mathbb{R}^5$ and $e \in \mathbb{R}^5$ are the vectors defined by

\[ x_0 := (0,0,A_{\mathrm{rest}},K_{m,\mathrm{rest}},\tau_{1,\mathrm{rest}})^\top \quad \text{and} \quad e := (1,0,0,0,0)^\top, \]

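where $b(t) := \dfrac{1}{\tau_c} e^{-t/\tau_c}$ for all $t \in [0,T]$ and $G(t_{k-1},t_k) := (R-1) e^{t_{k-1}/\tau_c} + e^{t_k/\tau_c}$, with $t_{-1} := -\infty$, and finally where the explicit expression of the function $F : \mathbb{R}^5 \to \mathbb{R}^5$ is given by

\[
F(x) := \begin{pmatrix}
-\dfrac{x_1}{\tau_c} \\[3mm]
-\dfrac{x_1 + x_4}{x_5 (x_1 + x_4) + \tau_2 x_1}\, x_2 + x_3 \dfrac{x_1}{x_1 + x_4} \\[3mm]
-\dfrac{x_3 - A_{\mathrm{rest}}}{\tau_{\mathrm{fat}}} + \alpha_A x_2 \\[3mm]
-\dfrac{x_4 - K_{m,\mathrm{rest}}}{\tau_{\mathrm{fat}}} + \alpha_{K_m} x_2 \\[3mm]
-\dfrac{x_5 - \tau_{1,\mathrm{rest}}}{\tau_{\mathrm{fat}}} + \alpha_{\tau_1} x_2
\end{pmatrix},
\]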

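for all $x = (x_1,x_2,x_3,x_4,x_5) \in \mathbb{R}^5$. In the paper [B20] we have considered the general optimal control problem of Mayer form given by

\[
\begin{array}{rl}
\text{minimize} & g(x(T)), \\[1mm]
\text{subject to} & P = \{t_k\}_{k=0,\ldots,N} \in \mathcal{P}_N, \quad x \in \mathrm{AC}([0,T],\mathbb{R}^5), \quad (v_0,\ldots,v_{N-1}) \in \mathbb{R}^N, \\
& \dot{x}(t) = F(x(t)) + b(t) \left( \displaystyle\sum_{k=0}^{N-1} G(t_{k-1},t_k) v_k \Theta(t - t_k) \right) e, \quad \text{a.e. } t \in [0,T], \\
& x(0) = x_0, \\
& v_k \in [0,1], \quad \forall k = 0,\ldots,N-1, \\
& t_k - t_{k-1} \geq I_{\min}, \quad \forall k = 1,\ldots,N-1,
\end{array}
\tag{4.5}
\]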

where $N \in \mathbb{N}^*$ and $T > 0$ are fixed, where $g : \mathbb{R}^5 \to \mathbb{R}$ is a differentiable function and where $I_{\min} \geq 0$ is the minimal authorized interpulse. The term $g(x(T))$ plays the role of a general cost of Mayer form to minimize (for instance the opposite of the final force response, as illustrated in Section 4.3.3).
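
The right-hand side of the control system (4.4) can be sketched as follows. This block is an illustration and not the code used in [B20]: the function names `drift`, `control_term` and `rhs` are hypothetical, the constants are those of Table 4.1, and the list `times` contains only the pulse times $t_0, \ldots, t_{N-1}$. The convention $t_{-1} := -\infty$ is handled through `math.exp(-math.inf) == 0.0`.

```python
import math

# Constants from Table 4.1.
TAU_C, R_BAR, TAU_2, TAU_FAT = 0.02, 1.143, 0.1244, 127.0
A_REST, KM_REST, TAU1_REST = 3.009, 0.103, 0.05095
ALPHA_A, ALPHA_KM, ALPHA_TAU1 = -4.0e-1, 1.9e-2, 2.1e-2

def drift(x):
    """Uncontrolled vector field F(x) for x = (C_N, F, A, K_m, tau_1)."""
    x1, x2, x3, x4, x5 = x
    return [
        -x1 / TAU_C,
        -(x1 + x4) / (x5 * (x1 + x4) + TAU_2 * x1) * x2 + x3 * x1 / (x1 + x4),
        -(x3 - A_REST) / TAU_FAT + ALPHA_A * x2,
        -(x4 - KM_REST) / TAU_FAT + ALPHA_KM * x2,
        -(x5 - TAU1_REST) / TAU_FAT + ALPHA_TAU1 * x2,
    ]

def b(t):
    """Function b(t) = exp(-t / tau_c) / tau_c."""
    return math.exp(-t / TAU_C) / TAU_C

def G(t_prev, t_k):
    """Function G(t_{k-1}, t_k), with t_prev = -inf for the first pulse."""
    return (R_BAR - 1.0) * math.exp(t_prev / TAU_C) + math.exp(t_k / TAU_C)

def control_term(t, times, amps):
    """Scalar input b(t) * sum_k G(t_{k-1}, t_k) v_k Theta(t - t_k)."""
    total, t_prev = 0.0, -math.inf
    for t_k, v_k in zip(times, amps):
        if t > t_k:  # left-continuous Heaviside function
            total += G(t_prev, t_k) * v_k
        t_prev = t_k
    return b(t) * total

def rhs(t, x, times, amps):
    """Right-hand side of (4.4): the input only acts on the first component."""
    dx = drift(x)
    dx[0] += control_term(t, times, amps)
    return dx

x0 = [0.0, 0.0, A_REST, KM_REST, TAU1_REST]
print(drift(x0))
print(rhs(0.05, x0, [0.0], [1.0]))
```

The rest state $x_0$ is an equilibrium of the drift, and the input term coincides with the FES signal $E(t)$ of Section 4.3.1 since $R_k e^{-(t - t_k)/\tau_c} = e^{-t/\tau_c}\, G(t_{k-1},t_k)$.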

Remark 4.7. One should note that the above optimal control problem (4.5) does not fit exactly with the framework of optimal sampled-data control problems with free sampling times considered in the previous Section 4.2 for two reasons. The first difference lies in the fact that the value of each amplitude $v_k$ intervenes in the control system (4.4) over the interval $[t_k,T]$ and not only on the interval $[t_k,t_{k+1})$. The second difference lies in the fact that the values of the sampling times $t_k$ intervene explicitly in the control system (4.4). As a consequence the necessary optimality conditions derived in Theorem 4.3 cannot be applied and have to be adapted. This is exactly the content of the next theorem which corresponds to the main result of the paper [B20].

Theorem 4.8. If $(P^*,x^*,v^*_0,\ldots,v^*_{N-1})$ is a solution to Problem (4.5), then the adjoint vector $p = (p_1,\ldots,p_5)$ defined as the unique global solution to the backward linear Cauchy problem given by

\[
\dot{p}(t) = -\nabla F(x^*(t))^\top p(t), \quad \text{a.e. } t \in [0,T], \qquad p(T) = -\nabla g(x^*(T)),
\]

satisfies:

(i) the inequality

\[ \left( \int_{t^*_k}^T p_1(s) b(s)\, ds \right) v_k \leq 0, \]

for all $k = 0,\ldots,N-1$ and all admissible perturbations $v_k$ of $v^*_k$ (that is, all $v_k \in \mathbb{R}$ such that $v^*_k + \varepsilon v_k \in [0,1]$ for some $\varepsilon > 0$);

(ii) and the inequality

\[ \left( -p_1(t^*_k) b(t^*_k) G(t^*_{k-1},t^*_k) v^*_k + b(-t^*_k) v^*_k \int_{t^*_k}^T p_1(s) b(s)\, ds + b(-t^*_k)(R-1) v^*_{k+1} \int_{t^*_{k+1}}^T p_1(s) b(s)\, ds \right) t_k \leq 0, \]

for all $k = 1,\ldots,N-1$ and all admissible perturbations $t_k$ of $t^*_k$ (that is, all $t_k \in \mathbb{R}$ such that $(t^*_k + \varepsilon t_k) - t^*_{k-1} \geq I_{\min}$ and $t^*_{k+1} - (t^*_k + \varepsilon t_k) \geq I_{\min}$ for some $\varepsilon > 0$).

Remark 4.9. Theorem 4.8 can be found in [B20, Theorem 3.1]. As mentioned in Remark 4.7, the optimal control problem (4.5) does not fit exactly with the framework of optimal sampled-data control problems with free sampling times considered in the previous Section 4.2. As a consequence, in the paper [B20], the techniques had to be adapted to the present framework. In particular the sensitivity analysis of the state equation had to take into account the two facts that, on the one hand, the sampling times $t_k$ intervene explicitly in the dynamics and that, on the other hand, the values $v_k$ intervene in the dynamics all along the interval $[t_k,T]$ and not only on the interval $[t_k,t_{k+1})$. I refer to [B20, Sections 3.3 and 3.4] for the details. As a conclusion of this remark, it appears that introducing and dealing with a very general framework, which makes it possible to consider the two above issues, constitutes an interesting challenge for future works. Furthermore obtaining the corresponding necessary optimality conditions in a Hamiltonian form (in contrast to (i) and (ii) in Theorem 4.8) is also a perspective of interest.

4.3.3 Preliminary numerical results

My objective here is to give illustrations of Theorem 4.8 with some preliminary numerical simulations extracted from [B20, Section 4]. In this section we will focus on the problem of maximizing the final force response $x_2(T)$ expressed in millinewtons (mN), with fixed pulse amplitudes $v_k = 1$ for all $k = 0, \dots, N-1$, and with no minimal interpulse (that is, with $I_{\min} = 0$). In what follows the unit of time will be the second (s). Note that all figures and all tables have been gathered in the next Section 4.3.4.

According to the above context, we will focus only on the necessary optimality condition (ii) provided in Theorem 4.8. For all partitions $P = \{t_k\}_{k=0,\dots,N} \in \mathcal{P}_N$, we define

\[
\mathrm{NC}_k(P) := -p_1(t_k) b(t_k) G(t_{k-1}, t_k) + b(-t_k) \int_{t_k}^{T} p_1(s) b(s) \, ds + b(-t_k)(R-1) \int_{t_{k+1}}^{T} p_1(s) b(s) \, ds,
\]

for all $k = 1, \dots, N-1$. Assuming that the considered optimization problem admits a solution $P^* = \{t^*_k\}_{k=0,\dots,N} \in \mathcal{P}_N$, the item (ii) of Theorem 4.8 reduces to the equalities

\[
\forall k = 1, \dots, N-1, \quad \mathrm{NC}_k(P^*) = 0. \tag{4.6}
\]
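As an illustration of how the residuals appearing in (4.6) might be evaluated for a candidate partition, here is a minimal Python sketch. It is not the implementation used in [B20]: the callables `p1`, `b`, `G` and the constant `R` stand for the quantities of the control system (4.4), which are not reproduced in this section, and the helper names `trapezoid` and `nc_residuals` are hypothetical. The integrals are approximated with a plain composite trapezoidal rule.

```python
def trapezoid(f, lo, hi, n=2000):
    """Composite trapezoidal approximation of the integral of f over [lo, hi]."""
    if hi <= lo:
        return 0.0
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return h * total


def nc_residuals(t, p1, b, G, R):
    """Return [NC_1(P), ..., NC_{N-1}(P)] for the partition t = [t_0, ..., t_N].

    p1, b and G are user-supplied callables (assumptions standing for the
    first costate component and the functions b and G of the model).
    """
    T = t[-1]

    def integrand(s):
        return p1(s) * b(s)

    residuals = []
    for k in range(1, len(t) - 1):
        tail_k = trapezoid(integrand, t[k], T)        # integral over [t_k, T]
        tail_k1 = trapezoid(integrand, t[k + 1], T)   # integral over [t_{k+1}, T]
        residuals.append(
            -p1(t[k]) * b(t[k]) * G(t[k - 1], t[k])
            + b(-t[k]) * tail_k
            + b(-t[k]) * (R - 1.0) * tail_k1
        )
    return residuals


# Toy inputs (purely illustrative, unrelated to the muscle model).
toy = nc_residuals([0.0, 0.1, 0.3, 0.5],
                   p1=lambda s: 1.0, b=lambda s: 1.0,
                   G=lambda s0, s1: 0.0, R=3.0)
```

In a shooting approach such as the one described next, a vector of this kind would be the function whose zero is sought; the costate $p_1$ would then come from a numerical integration of the adjoint equation rather than from a closed-form callable.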

In [B20, Section 4] we have implemented the combination of several numerical methods:

• Firstly we used a direct method in order to compute a numerical approximation $P$ of $P^*$ using BOCOP software [119], which takes into account the constraint $t_k \le t_{k+1}$. Indeed note that the numerical inversion of the elements $t_k$ is not allowed in our context since they explicitly intervene in the control system (4.4) and they do not play symmetrical roles. In the case N = 6 and T = 0.5, Figure 4.3 represents the values of $\|(\mathrm{NC}_1(\cdot), \dots, \mathrm{NC}_{N-1}(\cdot))\|_{\mathbb{R}^{N-1}}$ in relation with the final force response $x_2(T)$ for many perturbations $P + \varepsilon\chi$ of $P$, where $\varepsilon > 0$ is a small parameter and $\chi$ is a uniform random variable in $[-1,1]^{N-1}$.

• Secondly we computed the corresponding initial costate $p(0)$ using specific and suitable numerical integrators from the JULIA programming language [184]. The numerical results obtained in the above case (N = 6 and T = 0.5) are detailed in Table 4.2. The time evolutions of the corresponding state $x$ and costate $p$ (and also of the associated FES signal $E$) are provided in Figure 4.4.

• Finally the couple $(P, p(0))$ was used as initialization of an indirect numerical method (shooting method) based on Equalities (4.6) and using HAMPATH software [127] in order to numerically compute the optimal pulse times $t^*_k$. Figure 4.5 represents the time evolutions of the optimal state and costate (and also of the associated FES signal $E$) computed with the above shooting method, which was initialized by the values provided in Table 4.2. Note that the shooting method recovers the solutions found by the direct method (see Table 4.3).



In this manuscript I have recalled the numerical results obtained in [B20, Section 4] for (only) one case (N = 6 and T = 0.5). Note that numerical results have been provided in other cases in [B20, Section 4], such as with N = 5 and T = 0.2, and with N = 11 and T = 0.8.

I conclude this section by emphasizing that the above results are (only) preliminary numerical results. Further studies have to be conducted in order to provide a theoretical and numerical study of the shooting equations (with possibly optimized pulse amplitudes and in the presence of a positive minimal interpulse $I_{\min}$, with a large final time $T$ and/or a large number $N$ of pulse times, etc.). In particular comparisons should be provided with respect to suboptimal numerical strategies designed in [109] using a Model Predictive Control (MPC) coupled with online estimation of the fatigue variables using a nonlinear observer. Finally another challenge would be to take into account inequality state constraints on the fatigue variables by adapting the results recently obtained in [B21] to the present context. The industrial research project whose aim is to design a smart electrical muscle stimulator is still in progress.

4.3.4 Figures and tables

Figure 4.3 – Case N = 6 and T = 0.5. Plot of the norm $\|(\mathrm{NC}_1(\cdot), \dots, \mathrm{NC}_{N-1}(\cdot))\|_{\mathbb{R}^{N-1}}$ with respect to the final force response $x_2(T)$ for 600 perturbations of $P$. The red dot indicates the solution $P$ computed by BOCOP.

k            1        2        3        4        5
t_k          0.241    0.314    0.369    0.414    0.448
NC_k(P)      −0.06    −0.03    −0.01    −0.02    −0.007

Table 4.2 – Case N = 6 and T = 0.5. Numerical results from BOCOP software. x(T) = (0.2652, 0.2890, 2.984, 0.104, 0.0522), p(0) = (4.6×10^{−4}, 3.1×10^{−3}, 0.0959, −0.8293, 1.246).

k            1        2        3        4        5
t*_k         0.226    0.303    0.362    0.409    0.477

Table 4.3 – Case N = 6 and T = 0.5. Numerical results from HAMPATH software. x(T) = (0.2652, 0.2901, 2.984, 0.104, 0.0522), p(0) = (6×10^{−4}, 4.3×10^{−3}, 0.0963, −0.8812, 1.3201).



Figure 4.4 – Case N = 6 and T = 0.5. Time evolutions of state x, costate p and FES signal E computed by BOCOP software.

Figure 4.5 – Case N = 6 and T = 0.5. Time evolutions of state, costate and FES signal computed by HAMPATH software.


Chapter 5

Convergence results and unified Riccati theory for linear-quadratic optimal permanent and sampled-data control problems in finite and infinite time horizons


5.2 A unified Riccati theory

5.2.1 Notations for a unified setting

5.2.2 Finite time horizon: permanent control versus sampled-data control (with fixed sampling times)

5.2.3 Infinite time horizon in autonomous setting: permanent control versus sampled-data control (with fixed uniform sampling times)

5.2.4 Discussion and perspectives

5.3 Main results of convergence

5.3.1 A first convergence result obtained in the paper [B18]

5.3.2 A commutative diagram of convergences obtained in the work [B23]


• [B18]: L. Bourdin and E. Trélat. Linear-quadratic optimal sampled-data control problems: convergence result and Riccati theory. Automatica J. IFAC, 79:273–281, 2017.

• [B23]: L. Bourdin and E. Trélat. Unified Riccati theory for optimal permanent and sampled-data control problems in finite and infinite time horizons. Submitted, 2020.


5.1 Introduction

In continuous-time optimal control theory, we speak of a Linear-Quadratic problem (in short LQ problem) when the control system is a linear differential equation and the cost is given by a quadratic integral (see, e.g., [167]). One of the main results in that field is that the optimal permanent control can be expressed as a linear state feedback. This expression is described by using the so-called Riccati matrix which is the solution to a nonlinear backward matrix Cauchy problem in finite time horizon (DRE: Differential Riccati Equation), and to a nonlinear algebraic matrix equation in infinite time horizon (ARE: Algebraic Riccati Equation). Since the pioneering works by Maxwell, Lyapunov and Kalman (see the textbooks [167, 170, 192]), the so-called Riccati theory has been extended to many contexts, among which: discrete-time [166], stochastic [203], infinite-dimensional [129], fractional [173], etc.

One of these extensions concerns continuous-time LQ optimal sampled-data control problems motivated by electrical and mechanical engineering issues with applications for example to strings of vehicles (see [105, 137, 164, 171, 177, 178, 187]). State feedback expressions and Riccati equations have been derived in that context, but with various equivalent formulations due to the different approaches developed. In most of the references, using the classical Duhamel formula, continuous-time LQ optimal sampled-data control problems are recast as discrete-time LQ optimal permanent control problems, and then the state feedback expression and the Riccati equation are obtained by applying the discrete-time dynamic programming principle (such as in [111, 137, 164]) or by applying a discrete-time version of the Pontryagin maximum principle (such as in [105, 137, 165]). In the paper [B18] written in collaboration with Trélat, we have developed a novel approach by applying directly the Pontryagin maximum principle for continuous-time optimal sampled-data control problems established in [B15, B16] (and recalled in Theorem 3.6). In particular our method preserved the original continuous-time writing of the LQ optimal sampled-data control problem, that is, without recasting it as a discrete-time problem.

Analogies between optimal permanent controls and optimal sampled-data controls in continuous-time LQ problems have been noticed in several works (see, e.g., [187] or [200, Remark 5.4]). In the recent work [B23], still in collaboration with Trélat, our objective was to provide a mathematical framework in which the Riccati theories for continuous-time LQ optimal permanent and sampled-data control problems can be settled in a unified way. Precisely, in [B23, Section 2], we gathered in a unified setting the main results of continuous-time LQ optimal control theory in the following four situations: permanent control versus sampled-data control, and finite time horizon versus infinite time horizon. To this aim, we introduced an important tool that is a map $\mathcal{F}$ (see its definition in Section 5.2.1) thanks to which we formulated, in the above four situations, state feedback expressions and Riccati equations. I refer to Propositions 5.1, 5.3, 5.9 and 5.11 in Section 5.2 of the present chapter for details.

Furthermore, exploiting the continuity of the map $\mathcal{F}$, we established in [B23, Theorem 1] convergence results between the involved Riccati matrices, either as the diameter of the partition goes to zero or as the finite time horizon goes to infinity. Hence four convergence results were obtained and summarized in a single diagram (recalled in Section 5.3.2 of the present chapter). Some of these convergence results were already known in the literature, some others are new. I refer to Remarks 5.16 and 5.17 for details.

I conclude this Introduction by mentioning that a first convergence result was obtained in our earlier paper [B18]. Precisely, in finite time horizon LQ problems, we proved in [B18, Theorem 1] that, when the diameter of the partition tends to zero, the corresponding optimal sampled-data control converges pointwise to the optimal permanent control. The proof of this result is based on the Pontryagin maximum principle for continuous-time optimal sampled-data control problems established in [B15, B16] (and recalled in Theorem 3.6). Furthermore we also derived in [B18, Remark 3] the uniform convergences of the corresponding state and costate, and also the convergence of the corresponding optimal cost. I refer to Section 5.3.1 for all details.



5.2 A unified Riccati theory

Throughout this chapter, given any $k \in \mathbb{N}^*$, we denote by $\mathcal{S}^k_+$ (resp. $\mathcal{S}^k_{++}$) the set of all symmetric positive semidefinite (resp. positive definite) matrices of $\mathbb{R}^{k \times k}$. Let $m, n \in \mathbb{N}^*$ and $Z \in \mathcal{S}^n_+$. Then, for every $t \in \mathbb{R}$, let $A(t) \in \mathbb{R}^{n \times n}$, $B(t) \in \mathbb{R}^{n \times m}$, $X(t) \in \mathcal{S}^n_+$ and $Y(t) \in \mathcal{S}^m_{++}$ be matrices depending continuously on $t$. Finally let $\Phi(\cdot,\cdot)$ stand for the state-transition matrix associated with $A(\cdot)$. In what follows we will preserve notations and terminology introduced in the previous Chapters 3 and 4.

5.2.1 Notations for a unified setting

Trélat and myself have considered in the work [B23] four different (unconstrained continuous-time) linear-quadratic optimal control problems: permanent control versus sampled-data control (with fixed sampling times), and finite time horizon versus infinite time horizon. Our aim was to provide a unified presentation of the four corresponding Riccati theories. To this aim we introduced the map

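\[
\begin{array}{rcl}
\mathcal{F} : \mathbb{R} \times \mathcal{S}^n_+ \times \mathbb{R}_+ & \longrightarrow & \mathbb{R}^{n \times n} \\
(t, R, \rho) & \longmapsto & \mathcal{F}(t,R,\rho) := \mathcal{M}(t,R,\rho) \, \mathcal{N}(t,R,\rho)^{-1} \mathcal{M}(t,R,\rho)^\top - \mathcal{G}(t,R,\rho),
\end{array}
\]

where $\mathcal{M}(t,R,\rho) := \mathcal{M}_1(t,R,\rho) + \mathcal{M}_2(t,R,\rho)$, $\mathcal{N}(t,R,\rho) := \mathcal{N}_1(t,R,\rho) + \mathcal{N}_2(t,R,\rho) + \mathcal{N}_3(t,R,\rho)$ and $\mathcal{G}(t,R,\rho) := \mathcal{G}_1(t,R,\rho) + \mathcal{G}_2(t,R,\rho)$, with

\[
\begin{array}{l|l|l}
 & \text{if } \rho > 0 & \text{if } \rho = 0 \\ \hline
\mathcal{M}_1(t,R,\rho) := & \Phi(t,t-\rho)^\top R \left( \dfrac{1}{\rho} \displaystyle\int_{t-\rho}^{t} \Phi(t,\tau) B(\tau) \, d\tau \right) & R B(t) \\[2ex]
\mathcal{M}_2(t,R,\rho) := & \dfrac{1}{\rho} \displaystyle\int_{t-\rho}^{t} \Phi(\tau,t-\rho)^\top X(\tau) \left( \int_{t-\rho}^{\tau} \Phi(\tau,s) B(s) \, ds \right) d\tau & 0_{\mathbb{R}^{n \times m}} \\[2ex]
\mathcal{N}_1(t,R,\rho) := & \dfrac{1}{\rho} \displaystyle\int_{t-\rho}^{t} Y(\tau) \, d\tau & Y(t) \\[2ex]
\mathcal{N}_2(t,R,\rho) := & \dfrac{1}{\rho} \displaystyle\int_{t-\rho}^{t} \left( \int_{t-\rho}^{\tau} B(s)^\top \Phi(\tau,s)^\top \, ds \right) X(\tau) \left( \int_{t-\rho}^{\tau} \Phi(\tau,s) B(s) \, ds \right) d\tau & 0_{\mathbb{R}^{m \times m}} \\[2ex]
\mathcal{N}_3(t,R,\rho) := & \dfrac{1}{\rho} \left( \displaystyle\int_{t-\rho}^{t} B(\tau)^\top \Phi(t,\tau)^\top \, d\tau \right) R \left( \displaystyle\int_{t-\rho}^{t} \Phi(t,\tau) B(\tau) \, d\tau \right) & 0_{\mathbb{R}^{m \times m}} \\[2ex]
\mathcal{G}_1(t,R,\rho) := & \dfrac{1}{\rho} \displaystyle\int_{t-\rho}^{t} \Phi(\tau,t-\rho)^\top X(\tau) \Phi(\tau,t-\rho) \, d\tau & X(t) \\[2ex]
\mathcal{G}_2(t,R,\rho) := & \dfrac{1}{\rho} \left( \Phi(t,t-\rho)^\top R \, \Phi(t,t-\rho) - R \right) & A(t)^\top R + R A(t)
\end{array}
\]

and we proved in [B23, Lemma 4] that the map $\mathcal{F} : \mathbb{R} \times \mathcal{S}^n_+ \times \mathbb{R}_+ \to \mathbb{R}^{n \times n}$ is well-defined and continuous.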



5.2.2 Finite time horizon: permanent control versus sampled-data control (with fixed sampling times)

The present section summarizes the content of [B23, Section 2.2] in which we have focused on linear-quadratic optimal (permanent and sampled-data) control problems in finite time horizon. Our aim was to provide a unified presentation of the two corresponding Riccati theories by using the map $\mathcal{F}$ introduced in the previous Section 5.2.1. Let us start with the case of permanent control in the next proposition.

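Proposition 5.1 (Permanent control in finite time horizon). Let $T > 0$ and $x_0 \in \mathbb{R}^n$. The linear-quadratic optimal permanent control problem in finite time horizon $T$ given by

\[
\begin{array}{rl}
\text{minimize} & \langle Z x(T), x(T) \rangle_{\mathbb{R}^n} + \displaystyle\int_0^T \langle X(\tau) x(\tau), x(\tau) \rangle_{\mathbb{R}^n} + \langle Y(\tau) u(\tau), u(\tau) \rangle_{\mathbb{R}^m} \, d\tau, \\[1ex]
\text{subject to} & x \in \mathrm{AC}([0,T],\mathbb{R}^n), \quad u \in \mathrm{L}^2([0,T],\mathbb{R}^m), \\
& \dot x(t) = A(t) x(t) + B(t) u(t), \quad \text{a.e. } t \in [0,T], \\
& x(0) = x_0,
\end{array}
\tag{LQP$^T_{x_0}$}
\]

has a unique solution $(x^*, u^*)$. Moreover $u^*$ is the (time-varying) state feedback

\[
u^*(t) = -\mathcal{N}(t, R^T(t), 0)^{-1} \mathcal{M}(t, R^T(t), 0)^\top x^*(t), \quad \text{a.e. } t \in [0,T],
\]

where $R^T : [0,T] \to \mathcal{S}^n_+$ is the unique solution to the Permanent Differential Riccati Equation given by

\[
\begin{cases}
\dot R^T(t) = \mathcal{F}(t, R^T(t), 0), & \forall t \in [0,T], \\
R^T(T) = Z.
\end{cases}
\tag{P-DRE}
\]

Furthermore the minimal cost of (LQP$^T_{x_0}$) is equal to $\langle R^T(0) x_0, x_0 \rangle_{\mathbb{R}^n}$.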

Remark 5.2. The mathematical content of Proposition 5.1 is very well known in the literature. The state feedback expression of the optimal control $u^*$ in Proposition 5.1 is usually written as

\[
u^*(t) = -Y(t)^{-1} B(t)^\top R^T(t) x^*(t), \quad \text{a.e. } t \in [0,T],
\]

and (P-DRE) is usually presented as

\[
\begin{cases}
\dot R^T(t) = R^T(t) B(t) Y(t)^{-1} B(t)^\top R^T(t) - X(t) - A(t)^\top R^T(t) - R^T(t) A(t), & \forall t \in [0,T], \\
R^T(T) = Z.
\end{cases}
\]

I refer the reader to standard references such as [122, 167, 170, 192, 195].
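The agreement between the two presentations can be checked by substituting the $\rho = 0$ column of the definitions of Section 5.2.1. The following lines are only a bookkeeping verification (using the symmetry of $R$), not an additional result:

```latex
\begin{align*}
\mathcal{M}(t,R,0) &= R B(t), \qquad
\mathcal{N}(t,R,0) = Y(t), \qquad
\mathcal{G}(t,R,0) = X(t) + A(t)^\top R + R A(t), \\
\mathcal{F}(t,R,0) &= R B(t) Y(t)^{-1} B(t)^\top R - X(t) - A(t)^\top R - R A(t), \\
-\mathcal{N}(t,R,0)^{-1} \mathcal{M}(t,R,0)^\top &= -Y(t)^{-1} B(t)^\top R .
\end{align*}
```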

Before dealing with sampled-data control in the next proposition, we first need to introduce some notation. Given any $T > 0$ and any $N$-partition $P = \{t_k\}_{k=0,\dots,N} \in \mathcal{P}_N$ of the interval $[0,T]$, we denote by $\rho_k := t_k - t_{k-1} > 0$ for all $k = 1, \dots, N$.

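Proposition 5.3 (Sampled-data control in finite time horizon). Let $T > 0$ and $x_0 \in \mathbb{R}^n$. Let $P = \{t_k\}_{k=0,\dots,N} \in \mathcal{P}_N$ be a fixed $N$-partition of the interval $[0,T]$. The linear-quadratic optimal sampled-data control problem in finite time horizon $T$ given by

\[
\begin{array}{rl}
\text{minimize} & \langle Z x(T), x(T) \rangle_{\mathbb{R}^n} + \displaystyle\int_0^T \langle X(\tau) x(\tau), x(\tau) \rangle_{\mathbb{R}^n} + \langle Y(\tau) u(\tau), u(\tau) \rangle_{\mathbb{R}^m} \, d\tau, \\[1ex]
\text{subject to} & x \in \mathrm{AC}([0,T],\mathbb{R}^n), \quad u \in \mathrm{PC}^P([0,T],\mathbb{R}^m), \\
& \dot x(t) = A(t) x(t) + B(t) u_k, \quad \text{a.e. } t \in [t_k, t_{k+1}), \quad \forall k = 0, \dots, N-1, \\
& x(0) = x_0,
\end{array}
\tag{LQP$^{T,P}_{x_0}$}
\]

has a unique solution $(x^*, u^*)$. Moreover $u^*$ is the (time-varying) state feedback

\[
u^*_k = -\mathcal{N}(t_{k+1}, R^{T,P}_{k+1}, \rho_{k+1})^{-1} \mathcal{M}(t_{k+1}, R^{T,P}_{k+1}, \rho_{k+1})^\top x^*(t_k), \quad \forall k = 0, \dots, N-1,
\]

where $R^{T,P} = (R^{T,P}_k)_{k=0,\dots,N} \subset \mathcal{S}^n_+$ is the unique solution to the Sampled-Data Difference Riccati Equation given by

\[
\begin{cases}
R^{T,P}_{k+1} - R^{T,P}_k = \rho_{k+1} \mathcal{F}(t_{k+1}, R^{T,P}_{k+1}, \rho_{k+1}), & \forall k = 0, \dots, N-1, \\
R^{T,P}_N = Z.
\end{cases}
\tag{SD-DRE}
\]

Furthermore the minimal cost of (LQP$^{T,P}_{x_0}$) is equal to $\langle R^{T,P}_0 x_0, x_0 \rangle_{\mathbb{R}^n}$.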

Remark 5.4. The extension of Proposition 5.1 to the sampled-data control case (as in Proposition 5.3) has various equivalent formulations in the literature. Using the Duhamel formula, Problem (LQP$^{T,P}_{x_0}$) is usually recast as a discrete-time linear-quadratic optimal permanent control problem. In this way, the state feedback expression of the optimal control $u^*$ in Proposition 5.3 and (SD-DRE) were first obtained in [164] by applying a discrete-time dynamic programming principle (method revisited in [137, p. 616] or more recently in [111, Theorem 4.1]), while they are derived in [105, Appendix B] or in [137, p. 618] by applying a discrete-time version of the Pontryagin maximum principle. These different approaches lead to different presentations of Proposition 5.3 and the relationships with Proposition 5.1 (usually presented as in Remark 5.2) are hidden, while they are here very apparent thanks to the map $\mathcal{F}$.
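To illustrate how the discrete-time recast mentioned in Remark 5.4 relates to (SD-DRE), here is a short worked computation. It is a sketch based on the Duhamel formula and on the definitions of Section 5.2.1 evaluated at $t = t_{k+1}$, $\rho = \rho_{k+1}$; the shorthand $A_k$, $B_k$, $Q_k$, $S_k$, $W_k$ is introduced here for convenience and does not appear in the source.

```latex
\begin{align*}
& A_k := \Phi(t_{k+1}, t_k), \qquad
  B_k := \int_{t_k}^{t_{k+1}} \Phi(t_{k+1},\tau) B(\tau) \, d\tau, \qquad
  x(t_{k+1}) = A_k x(t_k) + B_k u_k, \\
& Q_k := \rho_{k+1} \mathcal{G}_1, \qquad
  S_k := \rho_{k+1} \mathcal{M}_2, \qquad
  W_k := \rho_{k+1} (\mathcal{N}_1 + \mathcal{N}_2), \\
& \int_{t_k}^{t_{k+1}} \langle X x, x \rangle + \langle Y u_k, u_k \rangle \, d\tau
  = \langle Q_k x(t_k), x(t_k) \rangle + 2 \langle S_k u_k, x(t_k) \rangle + \langle W_k u_k, u_k \rangle, \\
& \rho_{k+1} \mathcal{M} = A_k^\top R B_k + S_k, \qquad
  \rho_{k+1} \mathcal{N} = W_k + B_k^\top R B_k, \qquad
  \rho_{k+1} \mathcal{G} = Q_k + A_k^\top R A_k - R, \\
& R_k = Q_k + A_k^\top R_{k+1} A_k
  - (A_k^\top R_{k+1} B_k + S_k)(W_k + B_k^\top R_{k+1} B_k)^{-1}(A_k^\top R_{k+1} B_k + S_k)^\top .
\end{align*}
```

The last line is (SD-DRE) rewritten with this shorthand (with $R_k = R^{T,P}_k$), and it has the form of a discrete-time Riccati recursion with a cross term, which is consistent with the discrete-time presentations referred to in the remark.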

Remark 5.5. Note that we have established in [B18, Theorem 2 and Corollary 1] a more general version of Proposition 5.3 which takes into account nonhomogeneous terms in the description of Problem (LQP$^{T,P}_{x_0}$). I refer to [B18, Section 2.1] for details. I emphasize that this result was obtained with a different approach than the ones mentioned in the previous Remark 5.4. Indeed our proof was directly based on the application of the Pontryagin maximum principle for continuous-time optimal sampled-data control problems obtained in [B15, B16] (and recalled in Theorem 3.6). In particular our method made it possible to preserve the original continuous-time writing of Problem (LQP$^{T,P}_{x_0}$), that is, without recasting it as a discrete-time problem.

5.2.3 Infinite time horizon in autonomous setting: permanent control versus sampled-data control (with fixed uniform sampling times)

The present section summarizes the content of [B23, Section 2.3] in which we have focused on linear-quadratic optimal (permanent and sampled-data) control problems in infinite time horizon. Our aim was to provide a unified presentation of the two corresponding Riccati theories by using the map $\mathcal{F}$ introduced in Section 5.2.1. Note that our framework (only) dealt with an autonomous setting (see Definition 5.6 below) and, in case of sampled-data control, with fixed uniform sampling times.

Definition 5.6. We speak of an autonomous setting when $A(t) \equiv A \in \mathbb{R}^{n \times n}$, $B(t) \equiv B \in \mathbb{R}^{n \times m}$, $X(t) \equiv X \in \mathcal{S}^n_+$ and $Y(t) \equiv Y \in \mathcal{S}^m_{++}$ are constant with respect to $t$.

Remark 5.7. In the above autonomous setting, the state-transition matrix $\Phi$ satisfies $\Phi(t,\tau) = e^{(t-\tau)A}$ for all $(t,\tau) \in \mathbb{R} \times \mathbb{R}$. In that case, using simple changes of variable (see [B23, Remark 1] for details), one can easily see that the map $\mathcal{F}$, and also the maps $\mathcal{M}_i$, $\mathcal{N}_i$ and $\mathcal{G}_i$, do not depend on the variable $t$ and thus we can remove this dependence in the next results.

We denote by $\mathrm{AC}([0,+\infty),\mathbb{R}^n)$ the space of functions defined on $[0,+\infty)$ with values in $\mathbb{R}^n$ which are absolutely continuous over all intervals $[0,T]$ with $T > 0$, and by $\mathrm{L}^2([0,+\infty),\mathbb{R}^m)$ the Lebesgue space of square-integrable functions defined almost everywhere on $[0,+\infty)$ with values in $\mathbb{R}^m$. In what follows, when dealing with the autonomous setting, we will consider the two following hypotheses:



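(H1) $X \in \mathcal{S}^n_{++}$.

(H2) For every $x_0 \in \mathbb{R}^n$, there exists a pair $(x,u) \in \mathrm{AC}([0,+\infty),\mathbb{R}^n) \times \mathrm{L}^2([0,+\infty),\mathbb{R}^m)$ such that $\dot x(t) = A x(t) + B u(t)$ for a.e. $t \ge 0$ and $x(0) = x_0$, satisfying

\[
\int_0^{+\infty} \langle X x(\tau), x(\tau) \rangle_{\mathbb{R}^n} + \langle Y u(\tau), u(\tau) \rangle_{\mathbb{R}^m} \, d\tau < +\infty.
\]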

Remark 5.8. Assumption (H2) is known in the literature as optimizability assumption (or as finite cost assumption) and is related to various notions of stabilizability of linear permanent control systems (see [198]). A wide literature is dedicated to this topic (see [194] and references mentioned in [130, Section 10.10]). Recall that, if the pair $(A,B)$ satisfies the Kalman condition (see, e.g., [201, Theorem 1.2]) or only the weaker Popov–Belevitch–Hautus test condition (see, e.g., [194, Theorem 6.2]) then (H2) is satisfied.

As in the previous section we start with the permanent control case in the next proposition.

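Proposition 5.9 (Permanent control in infinite time horizon). Assume that we are in the autonomous setting (see Definition 5.6). Let $x_0 \in \mathbb{R}^n$. Under Assumptions (H1) and (H2), the linear-quadratic optimal permanent control problem in infinite time horizon given by

\[
\begin{array}{rl}
\text{minimize} & \displaystyle\int_0^{+\infty} \langle X x(\tau), x(\tau) \rangle_{\mathbb{R}^n} + \langle Y u(\tau), u(\tau) \rangle_{\mathbb{R}^m} \, d\tau, \\[1ex]
\text{subject to} & x \in \mathrm{AC}([0,+\infty),\mathbb{R}^n), \quad u \in \mathrm{L}^2([0,+\infty),\mathbb{R}^m), \\
& \dot x(t) = A(t) x(t) + B(t) u(t), \quad \text{a.e. } t \ge 0, \\
& x(0) = x_0,
\end{array}
\tag{LQP$^\infty_{x_0}$}
\]

has a unique solution $(x^*, u^*)$. Moreover $u^*$ is the state feedback

\[
u^*(t) = -\mathcal{N}(R^\infty, 0)^{-1} \mathcal{M}(R^\infty, 0)^\top x^*(t), \quad \text{a.e. } t \ge 0,
\]

where $R^\infty \in \mathcal{S}^n_{++}$ is the unique solution to the Permanent Algebraic Riccati Equation given by

\[
\begin{cases}
\mathcal{F}(R^\infty, 0) = 0_{\mathbb{R}^{n \times n}}, \\
R^\infty \in \mathcal{S}^n_+.
\end{cases}
\tag{P-ARE}
\]

Furthermore the minimal cost of (LQP$^\infty_{x_0}$) is equal to $\langle R^\infty x_0, x_0 \rangle_{\mathbb{R}^n}$.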

Remark 5.10. The mathematical content of Proposition 5.9 is very well known in the literature. The state feedback expression of the optimal control $u^*$ in Proposition 5.9 is usually written as

\[
u^*(t) = -Y^{-1} B^\top R^\infty x^*(t), \quad \text{a.e. } t \ge 0,
\]

and (P-ARE) is usually presented as

\[
\begin{cases}
R^\infty B Y^{-1} B^\top R^\infty - X - A^\top R^\infty - R^\infty A = 0_{\mathbb{R}^{n \times n}}, \\
R^\infty \in \mathcal{S}^n_+.
\end{cases}
\]

I refer the reader to standard references such as [122, 167, 170, 192, 195].

Before dealing with sampled-data control in the next proposition, we first need to introduce some notation. Let $\rho > 0$. The $\rho$-uniform partition of the interval $[0,+\infty)$ is the sequence $P = \{t_k\}_{k \in \mathbb{N}}$ where $t_k := k\rho$ for every $k \in \mathbb{N}$. We denote by $\|P\| := \rho$ and by $\mathrm{PC}^P([0,+\infty),\mathbb{R}^m)$ the space of functions defined on $[0,+\infty)$ with values in $\mathbb{R}^m$ that are piecewise constant according to the partition $P$, that is

\[
\mathrm{PC}^P([0,+\infty),\mathbb{R}^m) := \{ u : [0,+\infty) \to \mathbb{R}^m \mid \forall k \in \mathbb{N}, \ \exists u_k \in \mathbb{R}^m, \ \forall t \in [t_k, t_{k+1}), \ u(t) = u_k \}.
\]

We will also consider the following assumption that we call ρ-optimizability assumption:



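(H$^\rho_2$) For every $x_0 \in \mathbb{R}^n$, there exists a pair $(x,u) \in \mathrm{AC}([0,+\infty),\mathbb{R}^n) \times \mathrm{PC}^P([0,+\infty),\mathbb{R}^m)$ such that $\dot x(t) = A x(t) + B u(t)$ for a.e. $t \ge 0$ and $x(0) = x_0$, satisfying

\[
\int_0^{+\infty} \langle X x(\tau), x(\tau) \rangle_{\mathbb{R}^n} + \langle Y u(\tau), u(\tau) \rangle_{\mathbb{R}^m} \, d\tau < +\infty.
\]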

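Proposition 5.11 (Sampled-data control in infinite time horizon). Assume that we are in the autonomous setting (see Definition 5.6). Let $P = \{t_k\}_{k \in \mathbb{N}}$ be a fixed $\rho$-uniform partition of the interval $[0,+\infty)$ and let $x_0 \in \mathbb{R}^n$. Under Assumptions (H1) and (H$^\rho_2$), the linear-quadratic optimal sampled-data control problem in infinite time horizon given by

\[
\begin{array}{rl}
\text{minimize} & \displaystyle\int_0^{+\infty} \langle X x(\tau), x(\tau) \rangle_{\mathbb{R}^n} + \langle Y u(\tau), u(\tau) \rangle_{\mathbb{R}^m} \, d\tau, \\[1ex]
\text{subject to} & x \in \mathrm{AC}([0,+\infty),\mathbb{R}^n), \quad u \in \mathrm{PC}^P([0,+\infty),\mathbb{R}^m), \\
& \dot x(t) = A(t) x(t) + B(t) u_k, \quad \text{a.e. } t \in [t_k, t_{k+1}), \quad \forall k \in \mathbb{N}, \\
& x(0) = x_0,
\end{array}
\tag{LQP$^{\infty,P}_{x_0}$}
\]

has a unique solution $(x^*, u^*)$. Moreover $u^*$ is the state feedback

\[
u^*_k = -\mathcal{N}(R^{\infty,P}, \rho)^{-1} \mathcal{M}(R^{\infty,P}, \rho)^\top x^*(t_k), \quad \forall k \in \mathbb{N},
\]

where $R^{\infty,P} \in \mathcal{S}^n_{++}$ is the unique solution to the Sampled-Data Algebraic Riccati Equation given by

\[
\begin{cases}
\mathcal{F}(R^{\infty,P}, \rho) = 0_{\mathbb{R}^{n \times n}}, \\
R^{\infty,P} \in \mathcal{S}^n_+.
\end{cases}
\tag{SD-ARE}
\]

Furthermore the minimal cost of (LQP$^{\infty,P}_{x_0}$) is equal to $\langle R^{\infty,P} x_0, x_0 \rangle_{\mathbb{R}^n}$.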

Remark 5.12. Similarly to the finite time horizon case (see Remark 5.4), the state feedback expression of the optimal control $u^*$ in Proposition 5.11 and (SD-ARE) have various equivalent formulations in the literature (see [112, 171, 177, 178]). In most of these references, using the Duhamel formula, Problem (LQP$^{\infty,P}_{x_0}$) is recast as a discrete-time linear-quadratic optimal permanent control problem with infinite time horizon. With this approach, the optimizability of Problem (LQP$^{\infty,P}_{x_0}$) is reduced to the optimizability of the corresponding discrete-time problem (see [137, Theorem 3] or [171, p.348]).

Remark 5.13. Similarly to Remark 5.5, and in contrast to the approach evoked in the previous Remark 5.12, we provided in [B23, Appendix A.2] a proof of Proposition 5.11 by keeping the original continuous-time writing of Problem (LQP$^{\infty,P}_{x_0}$). Our proof is an adaptation to the sampled-data control case of the proof of Proposition 5.9 (which can be found for example in [122, p.153], [170, Theorem 7 p.198] or [195, Theorem 4.13]).

5.2.4 Discussion and perspectives

Our formulations of Propositions 5.1, 5.3, 5.9 and 5.11, by using the map $\mathcal{F}$ introduced in Section 5.2.1, provide a unified presentation of the four corresponding Riccati theories. Furthermore, since the map $\mathcal{F}$ is continuous, our setting was suitable in order to obtain in [B23] convergence results on the involved Riccati matrices (which are recalled in Section 5.3.2 of the present chapter).

Now let us discuss the optimizability assumptions introduced in the previous Section 5.2.3. Obviously, if (H$^\rho_2$) is satisfied for some $\rho > 0$, then (H2) is satisfied. In other words, (H$^\rho_2$) for a given $\rho > 0$ is stronger than (H2). Trélat and myself have proved in [B23, Lemma 1] that conversely, if (H1) and (H2) are satisfied, then there exists a threshold $\bar\rho > 0$ such that (H$^\rho_2$) is satisfied for every $0 < \rho \le \bar\rho$. Furthermore, according to the proof of [B23, Lemma 1], a lower bound of the threshold $\bar\rho > 0$ can be expressed as a function of the norms of $A$, $B$, $X$, $Y$ and $R^\infty$. Note that the proof of [B23, Lemma 1] is inspired by the techniques developed in [180] for preserving the stabilizing property of controls of nonlinear systems under sampling.

I conclude this section by emphasizing that Propositions 5.3 and 5.11 only deal with fixed sampling times, and even only with fixed uniform sampling times in Proposition 5.11. As far as I know, removing these constraints remains an open challenge. In the context of free sampling times, are we able to determine the optimal sampling times in a feedback way?

5.3 Main results of convergence

As mentioned in Section 4.2.5 of Chapter 3, a natural question which can be addressed when dealing with continuous-time optimal sampled-data control problems with a fixed partition $P$ is the asymptotic behavior when the diameter $\|P\|$ of $P$ tends to zero.

5.3.1 A first convergence result obtained in the paper [B18]

In this section we focus on the linear-quadratic optimal (permanent and sampled-data) control problems considered in Section 5.2.2, that is, in finite time horizon $T > 0$. Let $x_0 \in \mathbb{R}^n$ and $P = \{t_k\}_{k=0,\dots,N}$ be a fixed $N$-partition of the interval $[0,T]$. For the needs of this section we denote by $(x^*_P, u^*_P)$ the unique solution to Problem (LQP$^{T,P}_{x_0}$) provided in Proposition 5.3 and by $C^*_P$ the corresponding optimal cost. Applying the Pontryagin maximum principle for continuous-time optimal sampled-data control problems obtained in [B15, B16] (and recalled in Theorem 3.6), we denote by $p_P$ the corresponding adjoint vector and, using the nonpositive averaged Hamiltonian gradient condition, we get the open-loop formula

\[
u^*_{P,k} = \left( \frac{1}{t_{k+1} - t_k} \int_{t_k}^{t_{k+1}} Y(\tau) \, d\tau \right)^{-1} \left( \frac{1}{t_{k+1} - t_k} \int_{t_k}^{t_{k+1}} B(\tau)^\top p_P(\tau) \, d\tau \right), \quad \forall k = 0, \dots, N-1. \tag{5.1}
\]

On the other hand, let us denote by $(x^*, u^*)$ the unique solution to Problem (LQP$^T_{x_0}$) provided in Proposition 5.1 and by $C^*$ the corresponding optimal cost. Applying the classical Pontryagin maximum principle, we denote by $p$ the corresponding adjoint vector and, using the Hamiltonian maximization condition, we get the open-loop formula

\[
u^*(t) = Y(t)^{-1} B(t)^\top p(t), \quad \text{a.e. } t \in [0,T]. \tag{5.2}
\]

Using the apparent relationship between Equalities (5.1) and (5.2), we were able in [B18, Theorem 1] to prove the next convergence theorem.

Theorem 5.14. If the diameter $\|P\|$ of the partition $P$ tends to zero, then $x^*_P$ (resp. $p_P$) converges uniformly on $[0,T]$ to $x^*$ (resp. $p$). Furthermore $C^*_P$ tends to $C^*$, and $u^*_P$ converges pointwise on $[0,T]$ to $u^*$.

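As an illustration of the above convergence theorem, let us focus on the linear-quadratic optimal permanent control problem considered by Hager in [152, 153] given by

\[
\begin{array}{rl}
\text{minimize} & \displaystyle\int_0^1 x(\tau)^2 + \frac{1}{2} u(\tau)^2 \, d\tau, \\[1ex]
\text{subject to} & x \in \mathrm{AC}([0,1],\mathbb{R}), \quad u \in \mathrm{L}^2([0,1],\mathbb{R}), \\
& \dot x(t) = \frac{1}{2} x(t) + u(t), \quad \text{a.e. } t \in [0,1], \\
& x(0) = 1,
\end{array}
\tag{LQP$_{\mathrm{ex}}$}
\]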

which satisfies all the assumptions considered in this chapter. The unique optimal permanent control $u^*$ can be explicitly expressed as $u^*(t) = 2(e^{3t} - e^3)e^{-3t/2}/(2 + e^3)$ for a.e. $t \in [0,1]$. In what follows, we denote by (LQP$^P_{\mathrm{ex}}$) the same problem as Problem (LQP$_{\mathrm{ex}}$) but considering a sampled-data control $u \in \mathrm{PC}^P([0,1],\mathbb{R})$ where $P = \{t_k\}_{k=0,\dots,N}$ is a fixed uniform $N$-partition of the interval $[0,1]$, that is, $t_k := k/N$ for all $k = 0, \dots, N$. My aim now is to provide some numerical simulations for Problem (LQP$^P_{\mathrm{ex}}$), with different values of $N \in \mathbb{N}^*$, which are extracted from [B23, Section 3.3] and numerically computed from the feedback expression given in Proposition 5.3. When $N$ tends to $+\infty$ (and thus $\|P\|$ tends to zero), the pointwise convergence of $u^*_P$ to $u^*$ is depicted in Figure 5.1, as expected from Theorem 5.14.

Figure 5.1 – Pointwise convergence of u∗P to u∗ as N tends to +∞ (and thus as ‖P‖ tends to zero).
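For readers who wish to reproduce the qualitative behavior shown in Figure 5.1, the following Python sketch evaluates the feedback of Proposition 5.3 for Hager's scalar example (A = 1/2, B = 1, X = 1, Y = 1/2, Z = 0) on a uniform partition. It is not the code used in [B23]: the function names are hypothetical, and the closed-form expressions of the maps are my own evaluation of the integrals of Section 5.2.1 in the scalar autonomous case with $\Phi(t,\tau) = e^{a(t-\tau)}$.

```python
import math


def sampled_data_lq(N, a=0.5, X=1.0, Y=0.5, Z=0.0, T=1.0, x0=1.0):
    """Sampled-data feedback for the scalar problem dx/dt = a x + u on a uniform N-partition.

    Returns the list of control values (u_0, ..., u_{N-1}) and the Riccati
    sequence (R_0, ..., R_N) solving (SD-DRE) backward from R_N = Z.
    """
    rho = T / N
    Ad = math.exp(a * rho)                             # Phi(t, t - rho)
    Bd = (Ad - 1.0) / a                                # integral of Phi(t, tau) over one step
    E2 = (math.exp(2.0 * a * rho) - 1.0) / (2.0 * a)   # integral of e^{2 a s} over [0, rho]

    def M(R):      # M1 + M2
        return (Ad * R * Bd + X * (E2 - Bd) / a) / rho

    def Nmap(R):   # N1 + N2 + N3
        return Y + (X * (E2 - 2.0 * Bd + rho) / a**2 + R * Bd**2) / rho

    def G(R):      # G1 + G2
        return (X * E2 + (Ad**2 - 1.0) * R) / rho

    def F(R):
        return M(R) ** 2 / Nmap(R) - G(R)

    # Backward recursion R_{k+1} - R_k = rho * F(R_{k+1}, rho).
    R = [0.0] * (N + 1)
    R[N] = Z
    for k in range(N - 1, -1, -1):
        R[k] = R[k + 1] - rho * F(R[k + 1])

    # Forward pass: feedback u_k = -N^{-1} M x(t_k), then exact state update.
    x, controls = x0, []
    for k in range(N):
        u = -M(R[k + 1]) / Nmap(R[k + 1]) * x
        controls.append(u)
        x = Ad * x + Bd * u
    return controls, R


def u_permanent(t):
    """Explicit optimal permanent control of Hager's example."""
    return 2.0 * (math.exp(3.0 * t) - math.exp(3.0)) * math.exp(-1.5 * t) / (2.0 + math.exp(3.0))


def max_gap(N):
    """Largest gap between u_k and u*(t_k) over the uniform N-partition of [0, 1]."""
    controls, _ = sampled_data_lq(N)
    return max(abs(u - u_permanent(k / N)) for k, u in enumerate(controls))
```

Calling `max_gap` for increasing values of `N` gives a rough numerical picture of the pointwise convergence stated in Theorem 5.14; comparing at the left endpoint $t_k$ of each sampling interval is an arbitrary choice made here for simplicity.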

As a conclusion of this section, note that Theorem 5.14 deals with convergence results in the case of unconstrained linear-quadratic optimal sampled-data control problems. An open challenge is to address similar statements for general nonlinear optimal sampled-data control problems, and in the presence of control constraints and final state constraints. This is exactly the topic of the next Chapter 6 which summarizes the contributions of the work [B24] in collaboration with Trélat.

5.3.2 A commutative diagram of convergences obtained in the work [B23]

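Propositions 5.1, 5.3, 5.9 and 5.11 in Section 5.2 give state feedback expressions for optimal (permanent and sampled-data) controls of linear-quadratic problems in finite and infinite time horizons. In each case, the optimal control is expressed thanks to a Riccati matrix ($R^T$, $R^{T,P}$, $R^\infty$ and $R^{\infty,P}$ respectively). Our main result in [B23] asserts that, under suitable assumptions, the following diagram of convergences commutes:

\[
\begin{array}{ccccc}
\text{(SD-DRE)} & R^{T,P} & \xrightarrow{\; T \to +\infty \;} & R^{\infty,P} & \text{(SD-ARE)} \\[1ex]
 & \Big\downarrow {\scriptstyle \|P\| \to 0} & & \Big\downarrow {\scriptstyle \|P\| \to 0} & \\[1ex]
\text{(P-DRE)} & R^{T} & \xrightarrow{\; T \to +\infty \;} & R^{\infty} & \text{(P-ARE)}
\end{array}
\]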



The precise mathematical meaning of the above convergences is provided in the next theorem which is extracted from [B23, Theorem 1].

Theorem 5.15 (Commutative diagram). We have the following convergence results:

(i) Left arrow of the diagram: Given any $T > 0$, we have

\[
\lim_{\|P\| \to 0} \ \max_{k=0,\dots,N} \| R^T(t_k) - R^{T,P}_k \|_{\mathbb{R}^{n \times n}} = 0,
\]

for all $N$-partitions $P = \{t_k\}_{k=0,\dots,N} \in \mathcal{P}_N$ of the interval $[0,T]$.

(ii) Bottom arrow of the diagram: Assume that $Z = 0_{\mathbb{R}^{n \times n}}$ and that we are in the autonomous setting (see Definition 5.6). Under Assumptions (H1) and (H2), we have

\[
\lim_{T \to +\infty} R^T(t) = R^\infty, \quad \forall t \ge 0.
\]

(iii) Top arrow of the diagram: Assume that $Z = 0_{\mathbb{R}^{n \times n}}$ and that we are in the autonomous setting (see Definition 5.6). Let $P = \{t_k\}_{k \in \mathbb{N}}$ be a $\rho$-uniform partition of the interval $[0,+\infty)$. For all $N \in \mathbb{N}^*$, we denote by $P_N := P \cap [0, t_N]$ which is an $N$-partition of the interval $[0, t_N]$. Under Assumptions (H1) and (H$^\rho_2$), we have

\[
\lim_{N \to +\infty} R^{t_N, P_N}_k = R^{\infty,P}, \quad \forall k \in \mathbb{N}.
\]

(iv) Right arrow of the diagram: In the autonomous setting (see Definition 5.6), under Assumptions (H1) and (H2), we have

\[
\lim_{\rho \to 0} R^{\infty,P} = R^\infty,
\]

for all $\rho$-uniform partitions $P = \{t_k\}_{k \in \mathbb{N}}$ of the interval $[0,+\infty)$ with $0 < \rho \le \bar\rho$ (see Section 5.2.4 for details on the threshold $\bar\rho > 0$).

Remark 5.16. The proof of Theorem 5.15 can be found in [B23, Appendix A.4] and is briefly commented in the next list.

• The first item has been derived by using the continuity of the map $\mathcal{F}$ and by adapting the classical proof of the Lax theorem in numerical analysis (see, e.g., [181, p.73]).

• The second item is a very well known fact and follows from the proof of Proposition 5.9 (which can be found in [122, p.153], [170, Theorem 7] or [195, Theorem 4.13]).

• The third item follows from our proof of Proposition 5.11 given in [B23, Appendix A.2] (which is an adaptation of the proof of Proposition 5.9 to the sampled-data control context).

• The last item is proved in [B23, Appendix A.4] by using in particular the $\rho$-optimizability of Problem (LQP$^{\infty,P}_{x_0}$) for every $0 < \rho \le \bar\rho$. I emphasize that the $\rho$-optimizability obtained in [B23, Lemma 1] is uniform (in the sense that a bound independent of $0 < \rho \le \bar\rho$ is obtained). This uniform bound plays a crucial role in order to bound the family $(R^{\infty,P})_{0 < \|P\| \le \bar\rho}$ and finally, by using the continuity of the map $\mathcal{F}$, to obtain the convergence of $R^{\infty,P}$ to $R^\infty$ as $\|P\|$ tends to zero.

Remark 5.17. Some results similar to the four items of Theorem 5.15 have already been discussed in the literature. For example, in the autonomous setting and with uniform partitions, the first item of Theorem 5.15 has been proved in [105, Corollary 2.3] (a second-order convergence has even been derived). As evoked in Remarks 5.4 and 5.12, in the literature, the linear-quadratic optimal sampled-data control problems are usually rewritten as discrete-time problems. As a consequence the result of the third item of Theorem 5.15 is usually reduced in the literature to the corresponding result at the discrete level (see [137, Theorem 3] or [171, p.348]). Finally note that sensitivity analysis of (SD-ARE) with respect to $\rho$ has been explored in [145, 171, 177] by computing its derivative algebraically in view of optimization of the sampling period $\rho$. Note that the map $\mathcal{F}$ defined in Section 5.2.1 is a suitable candidate in order to invoke the classical implicit function theorem and to justify the differentiability of $R^{\infty,P}$ with respect to $\rho$.


Chapter 6

Convergence in nonlinear optimal sampled-data control problems with fixed endpoint


6.1 Introduction

6.2 Framework and preliminaries

6.2.1 Optimal permanent control problem

6.2.2 The corresponding optimal sampled-data control problem

6.3 Main results

6.3.1 A technical lemma of reachability

6.3.2 Convergence results and comments

6.3.3 Illustration of convergence with a classical parking problem

The present chapter summarizes the contributions of the following reference:

• [B24]: L. Bourdin and E. Trélat. Convergence in nonlinear optimal sampled-data control problems with fixed endpoint. Work in progress (submitted soon), 2020.


6.1 Introduction

In the previous Chapter 5, we have stated in Theorem 5.14 (extracted from [B18, Theorem 1]) that the optimal sampled-data control in an unconstrained (continuous-time) linear-quadratic problem converges pointwise to the optimal permanent control when the diameter of the (fixed) partition tends to zero. The proof of Theorem 5.14 is based on open-loop formulas for the optimal sampled-data and permanent controls derived from Pontryagin maximum principles, and on the uniform convergence of the state and costate associated with the optimal sampled-data control to the ones associated with the optimal permanent control. In Theorem 5.14 the convergence of the cost associated with the optimal sampled-data control to the one associated with the optimal permanent control is also established. I refer to Section 5.3.1 for details.

Theorem 5.14 deals (only) with unconstrained linear-quadratic optimal control problems (in particular with no control constraint and no final state constraint). As mentioned at the end of Section 5.3.1, an open challenge would be to address a similar statement for general nonlinear optimal control problems with fixed endpoint and in the presence of control constraints. This is exactly the topic of the recent (and ongoing) work [B24] jointly with Trélat in which we have obtained some convergence results in that direction. Precisely, invoking a strategy similar to the one used in the proof of the classical Filippov theorem [143] (and thus under compactness and convexity assumptions), we are able in [B24, Theorem 1] to obtain the convergence of the state and cost associated with the optimal sampled-data control to the ones associated with the optimal permanent control. This result is recalled in Theorem 6.15 of the present chapter and I refer to Remark 6.16 for comments on its proof.

Unfortunately this first result does not provide any information on the convergence of the optimal sampled-data control. Hence our objective in [B24] is to propose an indirect way in the cases where open-loop formulas are derivable from Pontryagin maximum principles. Precisely our idea is to refine the convexity assumption of [B24, Theorem 1] in order to derive, in addition, the uniform convergence of the costate associated with the optimal sampled-data control to the one associated with the optimal permanent control. This is exactly the content of our main result [B24, Theorem 2] which is recalled in Theorem 6.17 of the present chapter and I refer to Remark 6.18 for comments on its proof. An illustration of this convergence result is provided in Section 6.3.3 of the present chapter (extracted from [B24, Section III.B]) with the numerical resolution of a classical parking problem with sampled-data control.

As mentioned previously, the Filippov routine is used in order to handle the nonlinearity of the problems considered in the work [B24]. Actually one of the most difficult parts of our work [B24] lies in the fact that the final state is fixed. Indeed a fundamental question that we had to address in the first place, independently of our considerations about convergence, was the robustness of the reachability of the target under sampling. A counterexample in which the target is reachable using a permanent control, but is not reachable using a sampled-data control (even for small values of the diameter of the partition), can be easily constructed (see Example 6.10). In such a context, obviously, one cannot expect any convergence result when the diameter of the partition tends to zero (since the corresponding optimal sampled-data control problem is not feasible). Therefore we have established in [B24, Lemma 1] a technical lemma of reachability which asserts that, if a target is reachable with a permanent control which has no abnormal weak extremal lift (see Section 6.2 below for the precise definition), then there exists a threshold such that the target is reachable with a sampled-data control provided that the associated partition has a diameter less than this threshold. The proof of this result is based on the conic implicit function theorem [104, Theorem 1] but several difficulties (due to the fact that piecewise constant functions are not dense in the space of bounded measurable functions endowed with the L∞-norm) have to be overcome. This technical lemma of reachability, which might be of independent interest for researchers in continuous-time control theory with sampled-data controls, is recalled in Lemma 6.11 of the present chapter and I refer to Remark 6.12 for comments on its proof.



6.2 Framework and preliminaries

In the whole chapter we fix T > 0 and m, n ∈ N∗. In what follows we will preserve the notations and the terminology introduced in the previous Chapters 3, 4 and 5.

6.2.1 Optimal permanent control problem

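In the ongoing work [B24] in collaboration with Trélat, we consider the general nonlinear (continuous-time) optimal permanent control problem with fixed endpoint given by

\[
\begin{array}{rl}
\text{minimize} & \displaystyle \int_0^T L(x(\tau),u(\tau),\tau)\, d\tau, \\[4pt]
\text{subject to} & x \in \mathrm{AC}([0,T],\mathbb{R}^n), \quad u \in \mathrm{L}^\infty([0,T],\mathbb{R}^m), \\
& \dot{x}(t) = f(x(t),u(t),t), \quad \text{a.e. } t \in [0,T], \\
& x(0) = x_0, \quad x(T) = x_T, \\
& u(t) \in \mathrm{U}, \quad \text{a.e. } t \in [0,T],
\end{array}
\tag{FEP}
\]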

where x0, xT ∈ Rn are fixed and where the control constraint set U is a nonempty closed convex subset of Rm. In this chapter we assume that the dynamics f : Rn × Rm × [0,T] → Rn and the Lagrange cost function L : Rn × Rm × [0,T] → R are of class C² (see Remark 6.20 for a brief discussion on that assumption). As usual we will make use of the Hamiltonian function H : Rn × Rm × Rn × R × [0,T] → R associated with Problem (FEP) defined by

H(x,u, p,λ, t ) := ⟨p, f (x,u, t )⟩Rn −λL(x,u, t ),

for all (x,u, p,λ, t ) ∈Rn ×Rm ×Rn ×R× [0,T ].

Definition 6.1 (Strong and weak extremal lifts). Let (x,u) be an admissible couple for Problem (FEP) and (p,λ) ∈ AC([0,T],Rn) × R+ be a nontrivial couple satisfying the usual adjoint equation

\[ -\dot{p}(t) = \nabla_x H(x(t),u(t),p(t),\lambda,t), \quad \text{a.e. } t \in [0,T]. \tag{AE} \]

We say that (p,λ) is a strong extremal lift of (x,u) if, moreover, the Hamiltonian maximization condition

\[ u(t) \in \operatorname*{arg\,max}_{v \in \mathrm{U}} H(x(t),v,p(t),\lambda,t), \quad \text{a.e. } t \in [0,T], \tag{HM} \]

is satisfied. In contrast, the couple (p,λ) is said to be a weak extremal lift of (x,u) if only the (weaker) Hamiltonian gradient condition

\[ \nabla_u H(x(t),u(t),p(t),\lambda,t) \in \mathrm{N}_{\mathrm{U}}[u(t)], \quad \text{a.e. } t \in [0,T], \tag{HG} \]

is satisfied.

Remark 6.2. In the context of Definition 6.1, a (strong or weak) extremal lift (p,λ) is said to be normal if λ > 0, and abnormal if λ = 0. In the normal case, it is usual to renormalize the couple (p,λ) so that λ = 1.

Remark 6.3. According to the classical Pontryagin maximum principle [182], if (x∗,u∗) is a solution to Problem (FEP), then it necessarily admits a strong extremal lift (p,λ).

Remark 6.4. If (p,λ) is a strong extremal lift of an admissible couple (x,u), then it is obviously a weak extremal lift. If H is concave with respect to its second variable, then the converse is true. However the converse is not true in general. A counterexample is provided by considering T = m = n = 1, x0 = xT = 0, U = [−1,1], the dynamics f(x,u,t) = u³ and the Lagrange cost function L(x,u,t) = 0 for all (x,u,t) ∈ R × R × [0,1]. Let us consider λ ∈ {0,1}, x(t) = u(t) = 0 and p(t) = 1 for all t ∈ [0,1]. One can easily see that (p,λ) is a weak extremal lift of the admissible couple (x,u), but is not a strong extremal lift. Furthermore, in that counterexample, note that the admissible couple (x,u) is a solution to Problem (FEP) and that the weak extremal lift can be normal (λ = 1) or abnormal (λ = 0).
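The verification behind the counterexample of Remark 6.4 can be written out as follows. This is a short check added for the reader; it uses only the data of the remark, together with the fact that N_U[0] = {0} because 0 lies in the interior of U = [−1,1].

```latex
\begin{align*}
H(x,v,p,\lambda,t) &= p\,v^3 - \lambda \cdot 0 = p\,v^3, \\
-\dot{p}(t) &= \nabla_x H(x(t),u(t),p(t),\lambda,t) = 0
  && \text{(satisfied by } p \equiv 1\text{, so (AE) holds)}, \\
\nabla_u H(x(t),u(t),p(t),\lambda,t) &= 3\,p(t)\,u(t)^2 = 0 \in \mathrm{N}_{\mathrm{U}}[0] = \{0\}
  && \text{(so (HG) holds)}, \\
\max_{v \in [-1,1]} H(x(t),v,p(t),\lambda,t) &= \max_{v \in [-1,1]} v^3 = 1 > 0 = H(x(t),u(t),p(t),\lambda,t)
  && \text{(so (HM) fails)}.
\end{align*}
```

Since p ≡ 1 does not vanish, the couple (p,λ) is nontrivial for both λ = 0 and λ = 1. Moreover the cost is identically zero, so every admissible couple (in particular x = u = 0) is a solution.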

Remark 6.5. In this remark we first recall some notations introduced previously in the manuscript. Precisely, recall that E ⊂ C([0,T],Rn) stands for the set of all trajectories x ∈ C([0,T],Rn) that can be associated with a control u ∈ L∞([0,T],Rm) such that the couple (x,u) is admissible for Problem (FEP), and recall that the set of augmented velocities is defined by

\[ (f,L^+)(x,\mathrm{U},t) := \big\{ \big( f(x,u,t),\, L(x,u,t)+\gamma \big) \;\big|\; (u,\gamma) \in \mathrm{U} \times \mathbb{R}_+ \big\}. \]

According to the classical Filippov theorem [143], if the compactness hypothesis

\[
\left\{
\begin{array}{l}
\mathcal{E} \text{ is nonempty and bounded in } \mathrm{C}([0,T],\mathbb{R}^n), \\
\mathrm{U} \text{ is compact,}
\end{array}
\right.
\tag{Hcomp}
\]

and the convexity hypothesis

( f ,L+)(x,U, t ) is convex for all (x, t ) ∈ Rn × [0,T ], (Hconv)

are both satisfied, then Problem (FEP) has (at least) one solution.
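As a quick illustration of Hypothesis (Hconv), here is a sketch added for the reader. It uses the data of the parking problem of Section 6.3.3, namely n = 2, m = 1, f(x,u,t) = (x2,u), L(x,u,t) = u²/2 and U = [−1,1]. In that case the set of augmented velocities is the product of a singleton with the epigraph of a convex function, hence it is convex:

```latex
\[
(f,L^+)(x,\mathrm{U},t)
= \Big\{ \big( x_2,\; u,\; \tfrac{1}{2}u^2 + \gamma \big) \;\Big|\; u \in [-1,1],\ \gamma \geq 0 \Big\}
= \{ x_2 \} \times \operatorname{epi}\Big( u \in [-1,1] \mapsto \tfrac{1}{2}u^2 \Big).
\]
```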

6.2.2 The corresponding optimal sampled-data control problem

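In the ongoing work [B24] we also consider the corresponding nonlinear (continuous-time) optimal sampled-data control problem with fixed endpoint given by

\[
\begin{array}{rl}
\text{minimize} & \displaystyle \int_0^T L(x(\tau),u(\tau),\tau)\, d\tau, \\[4pt]
\text{subject to} & x \in \mathrm{AC}([0,T],\mathbb{R}^n), \quad u \in \mathrm{PC}^P([0,T],\mathbb{R}^m), \\
& \dot{x}(t) = f(x(t),u(t),t), \quad \text{a.e. } t \in [0,T], \\
& x(0) = x_0, \quad x(T) = x_T, \\
& u_k \in \mathrm{U}, \quad \forall k = 0,\ldots,N-1,
\end{array}
\tag{FEP$^P$}
\]

for all N-partitions P = {t_k}_{k=0,...,N} ∈ P_N of the interval [0,T]. In what follows we denote by P := ∪_{N∈N∗} P_N the set of all partitions of the interval [0,T].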

Definition 6.6 (P-averaged weak extremal lift). Let P = {t_k}_{k=0,...,N} ∈ P_N be an N-partition of the interval [0,T]. Let (x,u) be an admissible couple for Problem (FEP^P) and (p,λ) ∈ AC([0,T],Rn) × R+ be a nontrivial couple satisfying the adjoint equation (AE). We say that (p,λ) is a P-averaged weak extremal lift of (x,u) if, moreover, the averaged Hamiltonian gradient condition

\[ \int_{t_k}^{t_{k+1}} \nabla_u H(x(\tau),u_k,p(\tau),\lambda,\tau)\, d\tau \in \mathrm{N}_{\mathrm{U}}[u_k], \quad \forall k = 0,\ldots,N-1, \tag{AHG} \]

is satisfied.

Remark 6.7. In the context of Definition 6.6, a P-averaged weak extremal lift (p,λ) is said to be normal if λ > 0, and abnormal if λ = 0. In the normal case, it is usual to renormalize the couple (p,λ) so that λ = 1.

Remark 6.8. Let P ∈ P be a partition of the interval [0,T]. According to the Pontryagin maximum principle for continuous-time optimal sampled-data control problems established in [B15, B16] (and recalled in Theorem 3.6), if (x∗_P, u∗_P) is a solution to Problem (FEP^P), then it necessarily admits a P-averaged weak extremal lift denoted by (p_P, λ_P).



Remark 6.9. Let P ∈ P be a partition of the interval [0,T]. Let E^P ⊂ C([0,T],Rn) stand for the set of all trajectories x ∈ C([0,T],Rn) that can be associated with a control u ∈ PC^P([0,T],Rm) such that the couple (x,u) is admissible for Problem (FEP^P). If the compactness hypothesis

\[
\left\{
\begin{array}{l}
\mathcal{E}^P \text{ is nonempty and bounded in } \mathrm{C}([0,T],\mathbb{R}^n), \\
\mathrm{U} \text{ is compact,}
\end{array}
\right.
\tag{H$^P_{\mathrm{comp}}$}
\]

is satisfied, then Problem (FEP^P) has (at least) one solution. I emphasize that, since PC^P([0,T],Rm) is a finite-dimensional space (in contrast to the permanent control case in which L∞([0,T],Rm) is an infinite-dimensional space), this Filippov-type existence result does not require any convexity assumption. As far as I know, this simple result is new in the literature and I refer to [B24, Appendix D Section A] for a detailed proof.

6.3 Main results

In the whole section, for ease of notation, we denote by

\[ \mathcal{C}(x,u) := \int_0^T L(x(\tau),u(\tau),\tau)\, d\tau, \]

for all (x,u) ∈ AC([0,T],Rn) × L∞([0,T],Rm). Moreover we denote by L∞_U the subset of L∞([0,T],Rm) of functions with values in U. Similarly, if P ∈ P, we denote by PC^P_U the subset of PC^P([0,T],Rm) of functions with values in U.

Our major aim in [B24] is to provide convergence results on optimal elements (such as state, control, cost) of Problem (FEP^P) for P ∈ P to the ones of Problem (FEP) when the diameter ‖P‖ tends to zero. I refer to Theorems 6.15 and 6.17 in Section 6.3.2 for our main results in that direction, and to Section 6.3.3 for a numerical illustration in the simple context of a classical parking problem.

However, before coming to these results, note that Problem (FEP) (resp. Problem (FEP^P) for some partition P ∈ P) can be seen as the minimization problem of the functional C over E (resp. E^P). As a consequence the nonemptiness of E (resp. E^P), which coincides with the notion of reachability of the target xT from the initial condition x0 with an L∞_U-control (resp. a PC^P_U-control), is a fundamental question that we had to consider in the first place, independently of our considerations about convergence. This is exactly the content of the next preliminary Section 6.3.1 (extracted from [B24, Section III.C]).

6.3.1 A technical lemma of reachability

In this section we will focus on the relationships between the nonemptiness of E and the one of E^P for some P ∈ P. Firstly note the obvious fact that, if E^P ≠ ∅ for some P ∈ P, then E ≠ ∅. However the converse is not true in general, even for small values of ‖P‖, as illustrated in the next example.

Example 6.10. Consider T = m = 1, n = 2, x0 = (0,0), xT = (0,1), U = R and f((x1,x2),u,t) = ((u − x2)², 1) for all ((x1,x2),u,t) ∈ R² × R × [0,1]. In that context E ≠ ∅ since it is reduced to a singleton, corresponding to the unique admissible couple (x,u) where u(t) = t for a.e. t ∈ [0,1]. Since u ∉ PC^P([0,T],Rm) for all P ∈ P, we deduce that E^P = ∅ for all P ∈ P (even for small values of ‖P‖).
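The obstruction in Example 6.10 can also be quantified. The following sketch is an illustration added here; the function names are hypothetical and it is not taken from [B24]. It uses the fact that x2(t) = t, so that x1(T) is the integral of (u(t) − t)² over [0,1]. For a control that is constant on each sampling interval of length h, this integral is minimized by the midpoint value and the minimum over that interval equals h³/12. The smallest reachable value of x1(T) is therefore positive for every partition, although it tends to zero with the diameter.

```python
def min_terminal_x1(partition):
    """Smallest value of x1(T) reachable in Example 6.10 with a control that is
    constant on each sampling interval of `partition` (a sorted list of nodes)."""
    total = 0.0
    for t_left, t_right in zip(partition[:-1], partition[1:]):
        h = t_right - t_left
        # On [t_left, t_right) we have x2(t) = t, so the contribution to x1(T)
        # is the integral of (u_k - t)^2. It is minimized by the midpoint value
        # u_k = (t_left + t_right) / 2, and the minimum equals h^3 / 12.
        total += h ** 3 / 12.0
    return total


def uniform_partition(n_intervals, final_time=1.0):
    """Nodes of the uniform partition of [0, final_time] with n_intervals intervals."""
    return [final_time * k / n_intervals for k in range(n_intervals + 1)]


if __name__ == "__main__":
    for n in (2, 4, 20, 50):
        # The target requires x1(T) = 0, which is never attained.
        print(n, min_terminal_x1(uniform_partition(n)))
```

For the uniform partition with N intervals the value returned is 1/(12N²), which makes explicit that the target x1(T) = 0 is missed for every N.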

Assuming that E ≠ ∅, that is, the target xT is reachable from the initial condition x0 with an L∞_U-control, and in view of getting the same reachability but with a PC^P_U-control, our idea in [B24] is to use the conic implicit function theorem [104, Theorem 1]. To this aim, the surjectivity of the differential of the final input-output map (see [B24, Definition 2] for the precise definition) is required. It turns out that this (abstract) surjectivity property can be characterized by the (practical) absence of abnormal weak extremal lift. Therefore we formulate in [B24, Lemma 1] the following technical lemma of reachability which might be of independent interest for researchers in continuous-time control theory with sampled-data controls.



Lemma 6.11. If (x,u) ∈ E has no abnormal weak extremal lift, then there exists a threshold ρ > 0 such that E^P ≠ ∅ for all P ∈ P satisfying ‖P‖ ≤ ρ.

Remark 6.12. Lemma 6.11 can be found in [B24, Lemma 1]. I would like to underline that its proof, which can be found in [B24, Appendix D Section B], is not a simple and direct application of the conic implicit function theorem [104, Theorem 1]. Several difficulties had to be overcome in the first place. Indeed it is well known that an L∞-control cannot be approximated in L∞-norm by a PC^P-control in general, even for small values of ‖P‖. A counterexample is easily constructed with a Fuller-type L∞-control for instance. Nevertheless an L∞-control can be approximated in Ls-norm, with any 1 ≤ s < +∞, by using an averaged PC^P-control (see [B24, Appendix A Section B] for details). Moreover, since U is convex, this method proves that an L∞_U-control can be approximated in Ls-norm by a PC^P_U-control (see [B24, Remark 15] for details). Our idea is thus to consider the differential of the final input-output map with respect to the Ls-norm. However the final input-output map is not well defined when considering a general nonlinear control system with an Ls-control. We therefore introduce a truncated version of the dynamics f which vanishes outside a sufficiently large compact subset of Rn × Rm (see [B24, Appendix B Section B]). In that truncated context, the final input-output map is well defined, but is not Fréchet-differentiable when s = 1. However it is of class C¹ when 1 < s < +∞ and the surjectivity of its differential can be related to the one of the differential in L∞-norm of the final input-output map in the nontruncated setting. This is a key technical point in the proof of Lemma 6.11.
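The contrast between the L∞-norm and the Ls-norms mentioned in Remark 6.12 can be observed on a toy case. The sketch below is an illustration added here under simplifying assumptions. It is restricted to uniform partitions of [0,1] and to a single-switch control (equal to 1 before an irrational switching time c and to −1 after), which is much simpler than a Fuller-type control. The function name is hypothetical and nothing here is taken from [B24]. The averaged sampled-data control coincides with u except on the sampling interval containing c, so both errors can be computed in closed form.

```python
import math


def averaged_control_errors(n_intervals, switch_time=1.0 / math.sqrt(2.0)):
    """Return (L1 error, L-infinity error) between the control u(t) = 1 on [0, c),
    u(t) = -1 on [c, 1], and its averaged sampled-data approximation on the
    uniform partition of [0, 1] with n_intervals intervals (0 < c < 1 assumed)."""
    h = 1.0 / n_intervals
    k = math.floor(switch_time / h)      # index of the interval containing c
    a = (switch_time - k * h) / h        # relative position of c in that interval
    average = 2.0 * a - 1.0              # mean value of u over that interval
    # Elsewhere u is constant on each sampling interval, so the averaged
    # control equals u there and only this interval contributes to the errors.
    parts = [(a, abs(1.0 - average)), (1.0 - a, abs(-1.0 - average))]
    l1_error = h * sum(weight * gap for weight, gap in parts)
    linf_error = max(gap for weight, gap in parts if weight > 0.0)
    return l1_error, linf_error


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, averaged_control_errors(n))
```

In this toy case the L1 error is at most the mesh size h, whereas the L∞ error never drops below 1, because the switching time is never a node of a uniform partition.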

Remark 6.13. I take this opportunity to mention the remarkable series [148, 149, 150, 193] of papers by Grasse and Sussmann about reachability and controllability with a piecewise constant control. Hereafter I provide a (nonexhaustive) list of results obtained by these two authors. The relative positioning of Lemma 6.11 is detailed next.

(i) It is established in [193, Theorem 4.2] that the normal reachability of a point in the state space implies its normal reachability with a piecewise constant control. I recall that normal reachability means, roughly speaking, reachability under a surjectivity assumption closely related to the absence of abnormal weak extremal lift considered in the present work.

(ii) With a different point of view (not based on a surjectivity property) and with almost no assumption, it is remarkably established in [150, Theorem 3.17] that the global controllability of a control system implies the global controllability with a piecewise constant control (even with the restriction to take values in a countable dense subset of U). Recall that global controllability means that for each pair of points in the state space, there exists a trajectory that connects the two points.

(iii) In [148, Remark 3.5] the author mentions that if a point of the state space is normally reachable in time less than T then it belongs to the interior of the reachable set with a piecewise constant control in time less than T.

(iv) It is established in [149, Corollary 4.4] that, if the initial condition belongs to the interior of the reachable set, then this reachable set is equal to the reachable set with a piecewise constant control (even with the restriction to take values in a countable dense subset of U).

I would like to underline that Lemma 6.11 differs from the above results for two main reasons. Firstly the final time T > 0 is fixed in the present chapter, while it is not in the above mentioned contexts. For instance, in [193, Theorem 4.2], the (normal) reachability with a piecewise constant control is established, but a priori for a different final time T′. Secondly Lemma 6.11 proves the existence of a threshold ρ > 0 for which the reachability (at exactly time T) of a point of the state space with a piecewise constant control is guaranteed for every P ∈ P such that ‖P‖ ≤ ρ. The existence of this threshold (which is not considered in the works by Grasse and Sussmann) is of particular interest when considering refinements of partitions (for convergence results for instance, which are the focus of the work [B24]). Furthermore, since the inclusion ⊂ is not a total order over P, it is possible that ‖P2‖ ≤ ‖P1‖ while P2 ⊄ P1 for some P1, P2 ∈ P. With the works of Grasse and Sussmann, it is not guaranteed that the reachability of a point of the state space with a PC^{P1}_U-control implies its reachability with a PC^{P2}_U-control. With Lemma 6.11, if ‖P1‖ ≤ ρ, then it is guaranteed.

Remark 6.14. A possible challenge for further research would be to provide a lower bound on the threshold ρ > 0 given in Lemma 6.11 as a function of the data f, U, x0, xT and T.

6.3.2 Convergence results and comments

Thanks to the technical Lemma 6.11 in the previous Section 6.3.1, we are able in [B24] to derive the next convergence results. In the first theorem below, the convergence of the optimal trajectory and of the optimal cost is established.

Theorem 6.15. Let us assume that Hypotheses (Hcomp) and (Hconv) are both satisfied. If the solution (x∗,u∗) to Problem (FEP) (given in Remark 6.5) is unique and has no abnormal weak extremal lift, then there exists ρ > 0 such that, for all P ∈ P satisfying ‖P‖ ≤ ρ, Problem (FEP^P) admits a solution (x∗_P, u∗_P). Furthermore it holds that:

(i) x∗_P converges uniformly over [0,T] to x∗;

(ii) C(x∗_P, u∗_P) converges to C(x∗,u∗);

when ‖P‖ tends to zero.

Remark 6.16. Theorem 6.15 can be found in [B24, Theorem 1]. Its proof, which can be found in [B24, Appendix D Section C], starts by applying Lemma 6.11 in order to prove the existence of a threshold ρ > 0 such that E^P ≠ ∅ for all P ∈ P satisfying ‖P‖ ≤ ρ. Thus (H^P_comp) is satisfied and Problem (FEP^P) admits a solution (x∗_P, u∗_P) for all P ∈ P satisfying ‖P‖ ≤ ρ (see Remark 6.9). Then the proof is essentially based on the same tools as the one of the classical Filippov theorem. Nevertheless I emphasize that we have to involve truncated dynamics (as in the proof of Lemma 6.11, see Remark 6.12) and a truncated Lagrange cost function in order to take advantage of the continuity of the truncated cost functional with respect to the L1-norm (which is not true if the cost functional is not truncated) and of the density of ∪_{P∈P} PC^P_U in L∞_U endowed with the L1-norm (which is not true if we endow L∞_U with the L∞-norm).

Unfortunately the above theorem does not provide any information on the convergence of the optimal sampled-data control u∗_P to the optimal permanent control u∗ when the diameter ‖P‖ tends to zero. Hence our objective in [B24] is to propose an indirect way (based on the Pontryagin maximum principles and the extremal lifts) that could fill this gap in some particular situations. Roughly speaking, it is well known that the Hamiltonian maximization condition (HM) (or the weaker Hamiltonian gradient condition (HG)) allows in numerous cases to express the optimal permanent control u∗ as an open-loop control, that is, as a function Υ of the state-costate vector (x∗, p):

\[ u^*(t) = \Upsilon(x^*(t),p(t),t), \quad \text{a.e. } t \in [0,T]. \tag{6.1} \]

Similarly, for an N-partition P = {t_k}_{k=0,...,N} ∈ P_N of the interval [0,T], the averaged Hamiltonian gradient condition (AHG) may allow to express the values u∗_{P,k} of the optimal sampled-data control u∗_P as functions Υ_k of the restrictions to the sampling intervals [t_k, t_{k+1}] of the state-costate vector (x∗_P, p_P):

\[ u^*_{P,k} = \Upsilon_k\big( x^*_P|_{[t_k,t_{k+1}]},\, p_P|_{[t_k,t_{k+1}]} \big), \quad \forall k = 0,\ldots,N-1. \tag{6.2} \]

I refer to Section 5.3.1 for an explicit illustration of Equalities (6.1) and (6.2) in an unconstrained linear-quadratic context, or to Equalities (6.3) and (6.4) in the next Section 6.3.3 in the context of a classical parking problem.



As a consequence, our idea in [B24], in order to get information on the convergence of the optimal sampled-data control u∗_P, is to refine the convexity assumption of Theorem 6.15 so as to derive, in addition, the convergence of the corresponding costate p_P. To this aim we will make use of the augmented gradient velocity sets defined by

\[
(f,L^+)^\nabla(x,\mathrm{U},t) := \left\{
\begin{pmatrix}
f(x,u,t), & (\nabla_x f,\nabla_u f)(x,u,t), & \nabla_u f(x,u,t)\,u \\
L(x,u,t)+\gamma_1, & (\nabla_x L,\nabla_u L)(x,u,t), & \langle \nabla_u L(x,u,t),u \rangle_{\mathbb{R}^m}+\gamma_2
\end{pmatrix}
\;\middle|\; (u,\gamma_1,\gamma_2) \in \mathrm{U} \times \mathbb{R}_+ \times \mathbb{R}_+
\right\},
\]

for all (x,t) ∈ Rn × [0,T] and we introduce the convexity hypothesis

( f ,L+)∇(x,U, t ) is convex for all (x, t ) ∈ Rn × [0,T ]. (H∇conv)

Note that the convexity Hypothesis (H∇conv) is stronger than Hypothesis (Hconv). We are now in a position to recall the main result of [B24].

Theorem 6.17. Assume that Hypotheses (Hcomp) and (H∇conv) are satisfied. If the solution (x∗,u∗) to Problem (FEP) (given in Remark 6.5) is unique and has a unique weak extremal lift (p,λ) which is normal, then there exists ρ > 0 such that, for all P ∈ P satisfying ‖P‖ ≤ ρ, Problem (FEP^P) admits a solution (x∗_P, u∗_P) whose P-averaged weak extremal lift (p_P, λ_P) is, moreover, normal. Furthermore, renormalizing the extremal lifts so that λ = λ_P = 1, it holds that:

(i) x∗_P converges uniformly over [0,T] to x∗;

(ii) C(x∗_P, u∗_P) converges to C(x∗,u∗);

(iii) p_P converges uniformly over [0,T] to p;

when ‖P‖ tends to zero.

Remark 6.18. Theorem 6.17 can be found in [B24, Theorem 2]. Its proof, which can be found in [B24, Appendix D Section D], starts similarly to the proof of Theorem 6.15 but is more involved afterwards. Indeed we know that the (averaged or not) weak extremal lifts of an admissible couple (x,u) are strongly related to the variation vectors obtained under convex L∞-perturbations of the control u (see Section 3.3.5 for some details). The general idea of the proof of Theorem 6.17 is to deduce the convergence of the costate p_P from the convergence of the variation vectors. However the variation vectors are solutions to linear differential equations involving ∇x f, ∇u f, ∇x L, ∇u L, etc. As a consequence, in order to obtain the convergence of the variation vectors, we invoke the Filippov routine, not only on the trajectories as in the proof of Theorem 6.15 (which requires only the convexity Hypothesis (Hconv)), but also on the variation vectors, which thus requires the stronger convexity Hypothesis (H∇conv).

Remark 6.19. The convexity assumption (H∇conv) in Theorem 6.17 is particularly well adapted to affine-quadratic problems with respect to the control, that is, when the dynamics f and the Lagrange cost function L are of the forms

\[ f(x,u,t) = A(x,t)u + B(x,t) \quad \text{and} \quad L(x,u,t) = \tfrac{1}{2} \langle Y_1(t)u,u \rangle_{\mathbb{R}^m} + \langle Y_2(x,t),u \rangle_{\mathbb{R}^m} + Y_3(x,t), \]

for all (x,u,t) ∈ Rn × Rm × [0,T], where A : Rn × [0,T] → Rn×m, B : Rn × [0,T] → Rn, Y1 : [0,T] → Rm×m, Y2 : Rn × [0,T] → Rm and Y3 : Rn × [0,T] → R are functions of class C² and such that Y1(t) ∈ S^m_+ for all t ∈ [0,T]. If moreover E ≠ ∅, U is compact and A, B have sublinear growth with respect to their first variable, then Hypothesis (Hcomp) is satisfied.
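In this affine-quadratic setting, the link with Remark 6.4 can be made explicit. The computation below is a sketch added for the reader; it assumes that S^m_+ denotes the set of symmetric positive semidefinite matrices of size m, which is a reading of the notation and not a statement taken from [B24].

```latex
\begin{align*}
H(x,u,p,\lambda,t) &= \langle p,\, A(x,t)u + B(x,t) \rangle_{\mathbb{R}^n}
  - \lambda \Big( \tfrac{1}{2} \langle Y_1(t)u,u \rangle_{\mathbb{R}^m}
  + \langle Y_2(x,t),u \rangle_{\mathbb{R}^m} + Y_3(x,t) \Big), \\
\nabla_u H(x,u,p,\lambda,t) &= A(x,t)^\top p - \lambda \big( Y_1(t)u + Y_2(x,t) \big), \\
\nabla^2_{uu} H(x,u,p,\lambda,t) &= -\lambda\, Y_1(t) \preceq 0
  \quad \text{since } \lambda \geq 0 \text{ and } Y_1(t) \in \mathcal{S}^m_+ .
\end{align*}
```

Under this reading, H is concave with respect to u, so that, by Remark 6.4, every weak extremal lift is a strong extremal lift in the affine-quadratic case.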

Remark 6.20. In our work [B24], for simplicity, we decided to assume that the functions f and L are of class C². Nevertheless several results previously recalled (such as Pontryagin maximum principles and Filippov theorems) and even the main results (Theorems 6.15 and 6.17) do not require such regularity assumptions. This assumption is made here in order to apply the Filippov routine in the proof of Theorem 6.17 on the functions involved in the definition of (f,L+)∇(x,U,t), that is, f, ∇x f, ∇u f, etc. Actually one only needs these functions to be continuous and of class C¹ with respect to their first variable.

Remark 6.21. Theorems 6.15 and 6.17 can both be reformulated without any uniqueness hypothesis (neither on the solution of Problem (FEP), nor on the weak extremal lift) in terms of accumulation points.

Remark 6.22. In the proofs of Theorems 6.15 and 6.17, several weak* convergences in L∞-spaces are also established. For instance it holds that f(x∗_P, u∗_P, ·) converges weakly* in L∞([0,T],Rn) to f(x∗,u∗, ·) when the diameter ‖P‖ tends to zero.

Remark 6.23. Several extensions of Theorems 6.15 and 6.17 can be considered as research perspectives. For instance, as in Chapter 3, one may consider a general terminal state constraint of the form ψ(x(0), x(T)) ∈ S. Other possible extensions concern free final time problems, free sampling times, state constrained problems, etc.

6.3.3 Illustration of convergence with a classical parking problem

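This section is dedicated to an illustration of Theorem 6.17 with the classical parking problem of minimal energy given by

\[
\begin{array}{rl}
\text{minimize} & \displaystyle \frac{1}{2} \int_0^T u(\tau)^2\, d\tau, \\[4pt]
\text{subject to} & x = (x_1,x_2) \in \mathrm{AC}([0,T],\mathbb{R}^2), \quad u \in \mathrm{L}^\infty([0,T],\mathbb{R}), \\
& \dot{x}(t) = \begin{pmatrix} x_2(t) \\ u(t) \end{pmatrix}, \quad \text{a.e. } t \in [0,T], \\
& x(0) = (1,0), \quad x(T) = (0,0), \\
& u(t) \in [-1,1], \quad \text{a.e. } t \in [0,T],
\end{array}
\tag{FEPex}
\]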

where the final time is fixed such that 2 < T < √6 (for convenience of computations). Considering the control u(t) = −1 over [0,1], u(t) = 1 over [1,2] and u(t) = 0 over [2,T], we get that E ≠ ∅. Moreover, since Problem (FEPex) fits in the framework of Remark 6.19, we deduce that Hypotheses (Hcomp) and (H∇conv) are both satisfied. From the classical Filippov theorem, Problem (FEPex) admits (at least) one solution denoted by (x∗,u∗). From the classical Pontryagin maximum principle, it admits a strong extremal lift (p,λ). We write p = (p1, p2) and one can easily prove by contradiction that the couple (p,λ) is normal, so that we renormalize it as λ = 1. From the Hamiltonian gradient condition (HG) we get that

\[ u^*(t) = \mathrm{proj}_{\mathrm{U}}(p_2(t)), \quad \text{a.e. } t \in [0,T], \tag{6.3} \]

where projU : Rm → U stands for the classical projection operator onto U = [−1,1]. From there, one can prove that the solution (x∗,u∗) is unique and that (p,λ) is the unique weak extremal lift of (x∗,u∗) (which is normal as we have already mentioned). I refer to [B24, Section III.B] for all details.
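The reachability claim E ≠ ∅ made above can be checked by hand or with a few lines of code. The following sketch is added for illustration and the function name is hypothetical. It integrates the double integrator exactly on each piece where the control is constant and applies it to the three-piece control u = −1, u = 1, u = 0 described in the text.

```python
def simulate_double_integrator(x_start, pieces):
    """Integrate x1' = x2, x2' = u exactly for a piecewise constant control.
    `pieces` is a list of (duration, control value) pairs."""
    x1, x2 = x_start
    for duration, u in pieces:
        # Exact update over one piece: x2 is affine in time, x1 is quadratic.
        x1 += x2 * duration + 0.5 * u * duration ** 2
        x2 += u * duration
    return x1, x2


if __name__ == "__main__":
    T = 2.15  # any final time with 2 < T < sqrt(6); 2.15 is the value used for Figure 6.1
    pieces = [(1.0, -1.0), (1.0, 1.0), (T - 2.0, 0.0)]
    # The final state should be the target (0, 0).
    print(simulate_double_integrator((1.0, 0.0), pieces))
```

The state is (1/2, −1) at t = 1 and (0, 0) at t = 2, and it stays at the target on [2,T] since the control vanishes there.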

Hence we are in a position to apply Theorem 6.17. To this aim let us denote by (FEP^P_ex) the same problem as Problem (FEPex), but with a sampled-data control u ∈ PC^P([0,T],Rm) where P ∈ P is a fixed partition of the interval [0,T]. From Theorem 6.17, there exists a threshold ρ > 0 such that, for all P ∈ P satisfying ‖P‖ ≤ ρ, Problem (FEP^P_ex) admits a solution (x∗_P, u∗_P) whose P-averaged weak extremal lift (p_P, λ_P) is, moreover, normal. We renormalize it so that λ_P = 1, and we write p_P = (p_{P,1}, p_{P,2}).



Let P = {t_k}_{k=0,...,N} ∈ P_N be an N-partition of the interval [0,T] satisfying ‖P‖ ≤ ρ. From the averaged Hamiltonian gradient condition (AHG), we get that

\[ u^*_{P,k} = \mathrm{proj}_{\mathrm{U}} \left( \frac{1}{t_{k+1}-t_k} \int_{t_k}^{t_{k+1}} p_{P,2}(\tau)\, d\tau \right), \quad \forall k = 0,\ldots,N-1. \tag{6.4} \]
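For the reader's convenience, the computation leading to Equalities (6.3) and (6.4) can be sketched as follows. This derivation is added here; it relies only on the definitions of Section 6.2 with λ = 1 and on the standard characterization of the projection onto a closed convex set, namely w = proj_U(z) if and only if z − w ∈ N_U[w].

```latex
\begin{align*}
H(x,u,p,1,t) &= p_1 x_2 + p_2 u - \tfrac{1}{2}u^2,
  \qquad \nabla_u H(x,u,p,1,t) = p_2 - u, \\
\text{(HG):} \quad & p_2(t) - u^*(t) \in \mathrm{N}_{\mathrm{U}}[u^*(t)]
  \iff u^*(t) = \mathrm{proj}_{\mathrm{U}}(p_2(t)), \\
\text{(AHG):} \quad & \int_{t_k}^{t_{k+1}} \big( p_{P,2}(\tau) - u^*_{P,k} \big)\, d\tau \in \mathrm{N}_{\mathrm{U}}[u^*_{P,k}]
  \iff \frac{1}{t_{k+1}-t_k} \int_{t_k}^{t_{k+1}} p_{P,2}(\tau)\, d\tau - u^*_{P,k} \in \mathrm{N}_{\mathrm{U}}[u^*_{P,k}] \\
  & \iff u^*_{P,k} = \mathrm{proj}_{\mathrm{U}} \left( \frac{1}{t_{k+1}-t_k} \int_{t_k}^{t_{k+1}} p_{P,2}(\tau)\, d\tau \right).
\end{align*}
```

The first equivalence in the (AHG) line uses that the normal cone is a cone, so that dividing by the positive length t_{k+1} − t_k does not change the inclusion.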

From Theorem 6.17, we know that p_P converges uniformly over [0,T] to p when the diameter ‖P‖ tends to zero. From Equalities (6.3) and (6.4), we easily deduce that the optimal sampled-data control u∗_P converges pointwise over [0,T] to the optimal permanent control u∗ when ‖P‖ tends to zero. This convergence is depicted in Figure 6.1 by considering T = 2.15 and uniform partitions P ∈ P_N with different values of N ∈ N∗. The plots have been produced by solving Problem (FEP^P_ex) with a basic direct numerical method implemented in MATLAB.
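The convergence can also be reproduced without an optimization solver. The sketch below is an illustration added here and it is not the direct method used in [B24]. It rests on three assumptions that are my own: the costate component p2 is affine (which follows from the adjoint equation of this problem), the control is antisymmetric about T/2 on a uniform partition (a symmetry ansatz), and the closed-form slope used for the permanent control is my own computation for 2 < T < √6. With these assumptions, Equality (6.4) gives u_k = proj_U(c (m_k − T/2)), where m_k is the midpoint of the k-th sampling interval, and the slope c is tuned by bisection so that x1(T) = 0. All function names are hypothetical.

```python
import math


def saturate(value):
    """Projection onto U = [-1, 1]."""
    return max(-1.0, min(1.0, value))


def sampled_control(final_time, n_intervals):
    """Values u_k = proj_U(c * (m_k - T/2)) on the uniform N-partition, where
    m_k is the midpoint of [t_k, t_{k+1}] and the slope c is tuned by
    bisection so that x1(T) = 0 (symmetry ansatz, see the lead-in)."""
    h = final_time / n_intervals
    offsets = [(k + 0.5) * h - final_time / 2.0 for k in range(n_intervals)]

    def terminal_x1(slope):
        # For a control antisymmetric about T/2, x1(T) = 1 - h * sum_k d_k u_k.
        return 1.0 - h * sum(d * saturate(slope * d) for d in offsets)

    low, high = 0.0, 1.0e6
    if terminal_x1(high) > 0.0:
        raise ValueError("no control of the assumed form reaches the target")
    for _ in range(200):  # terminal_x1 is continuous and nonincreasing in slope
        mid = 0.5 * (low + high)
        if terminal_x1(mid) > 0.0:
            low = mid
        else:
            high = mid
    return [saturate(high * d) for d in offsets]


def terminal_state(controls, h, start=(1.0, 0.0)):
    """Exact integration of x1' = x2, x2' = u for a piecewise constant control."""
    x1, x2 = start
    for u in controls:
        x1 += x2 * h + 0.5 * u * h ** 2
        x2 += u * h
    return x1, x2


def permanent_control(t, final_time):
    """Closed form proj_U(c * (t - T/2)) for 2 < T < sqrt(6), with c chosen so
    that x1(T) = 0 (own computation under the symmetry ansatz)."""
    slope = 1.0 / math.sqrt(3.0 * (final_time ** 2 / 4.0 - 1.0))
    return saturate(slope * (t - final_time / 2.0))


def max_gap(final_time, n_intervals):
    """Largest gap, over the midpoints m_k, between sampled and permanent controls."""
    h = final_time / n_intervals
    controls = sampled_control(final_time, n_intervals)
    return max(abs(u - permanent_control((k + 0.5) * h, final_time))
               for k, u in enumerate(controls))


if __name__ == "__main__":
    for n in (2, 4, 20, 50):
        print(n, max_gap(2.15, n))
```

Since the sampled problem is a strictly convex quadratic program in the values u_k, a feasible control of this projected-affine form satisfies its optimality conditions; the helper `terminal_state` gives an independent check that both terminal constraints hold.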

[Figure 6.1: four panels, for N = 2, N = 4, N = 20 and N = 50, each plotting the optimal permanent control and the optimal sampled-data control; horizontal axis from 0 to 2, vertical axis from −1.5 to 1.5.]

Figure 6.1 – Pointwise convergence of the optimal sampled-data control u∗_P to the optimal permanent control u∗ in a classical parking problem when N tends to +∞ (and thus the diameter ‖P‖ of the uniform N-partition P of the interval [0,T] tends to zero).



[B12] L. Bourdin and E. Trélat. Pontryagin maximum principle for finite dimensional nonlinearoptimal control problems on time scales. SIAM J. Control Optim., 51(5):3781–3813, 2013. 33,34, 35, 36, 38, 39, 40, 42, 43, 44, 45, 46, 47, 48, 49, 50, 66

[B13] L. Bourdin and E. Trélat. General Cauchy–Lipschitz theory for ∆-Cauchy problems withCarathéodory dynamics on time scales. J. Difference Equ. Appl., 20(4):526–547, 2014. 33, 35,45, 46, 48

[B14] L. Bourdin. Nonshifted calculus of variations on time scales with ∇-differentiableσ. J. Math.Anal. Appl., 411(2):543–554, 2014. 36

[B15] L. Bourdin and E. Trélat. Pontryagin maximum principle for optimal sampled-data controlproblems. In Proceedings of the 16th IFAC Workshop on Control Applications of Optimization(CAO), Garmisch-Partenkirchen (Germany), 2015. 42, 44, 78, 81, 84, 90

[B16] L. Bourdin and E. Trélat. Optimal sampled-data control, and generalizations on time scales.Math. Control Relat. Fields, 6(1):53–94, 2016. 33, 34, 36, 37, 39, 40, 41, 42, 43, 44, 45, 46, 48,49, 50, 51, 78, 81, 84, 90

[B17] L. Bourdin, O. Stanzhytskyi, and E. Trélat. Addendum to “Pontryagin’s maximum principlefor dynamic systems on time scales". J. Difference Equ. Appl., 23(10):1760–1763, 2017. 48

[B18] L. Bourdin and E. Trélat. Linear-quadratic optimal sampled-data control problems: conver-gence result and Riccati theory. Automatica J. IFAC, 79:273–281, 2017. 67, 69, 77, 78, 81, 84,88

[B19] L. Bourdin and G. Dhar. Continuity/constancy of the Hamiltonian function in a Pontryaginmaximum principle for optimal sampled-data control problems with free sampling times.Math. Control Signals Systems, 31(4):503–544, 2019. 61, 62, 63, 64, 65, 66, 67

[B20] T. Bakir, B. Bonnard, L. Bourdin, and J. Rouot. Pontryagin-type conditions for optimal mus-cular force response to functional electrical stimulations. J. Optim. Theory Appl. (to appear),2019. 61, 62, 63, 70, 72, 73, 74

[B21] L. Bourdin and G. Dhar. Optimal sampled-data controls with running inequality state con-straints - Pontryagin maximum principle and bouncing trajectory phenomenon. Submitted,2019. 34, 37, 42, 51, 54, 55, 56, 74

[B22] P. Bettiol and L. Bourdin. Pontryagin maximum principle for state constrained optimalsampled-data control problems on time scales. Submitted, 2020. 34, 37, 38, 39, 40, 42, 43,46, 49, 50, 51, 57, 58

[B23] L. Bourdin and E. Trélat. Unified Riccati theory for optimal permanent and sampled-datacontrol problems in finite and infinite time horizons. Submitted, 2020. 77, 78, 79, 80, 81, 83,84, 85, 86

[B24] L. Bourdin and E. Trélat. Convergence in nonlinear optimal sampled-data control problemswith fixed endpoint. Work in progress (submitted soon), 2020. 85, 87, 88, 89, 90, 91, 92, 93,94, 95



[100] J. E. Ackermann. Sampled-data control systems: analysis and synthesis, robust system design.Springer-Verlag, 1985. 36

[101] R. Agarwal, M. Bohner, and A. Peterson. Inequalities on time scales: a survey. Math. Inequal.Appl., 4(4):535–557, 2001. 35

[102] R. P. Agarwal and M. Bohner. Basic calculus on time scales and some of its applications.Results Math., 35(1-2):3–22, 1999. 35

[103] R. P. Agarwal, V. Otero-Espinar, K. Perera, and D. R. Vivero. Basic properties of Sobolev’sspaces on time scales. Adv. Difference Equ., pages Art. ID 38121, 14, 2006. 35, 38

[104] P. Antoine and H. Zouaki. Étude locale de l’ensemble des points critiques d’un problèmed’optimisation paramétré. C. R. Acad. Sci. Paris Sér. I Math., 310(7):587–590, 1990. 88, 91, 92

[105] K. J. Aström. On the choice of sampling rates in optimal linear systems. IBM Research:Engineering Studies, 1963. 78, 81, 86

[106] K. J. Aström and B. Wittenmark. Computer-Controlled Systems. Prentice Hall, 1997. 36

[107] B. Aulbach and L. Neidhart. Integration on measure chains. In Proceedings of the Sixth In-ternational Conference on Difference Equations, pages 239–252. CRC, Boca Raton, FL, 2004.38

[108] T. Bakir. Contribution à la modélisation, l’estimation et la commande de systèmes nonlinéaires dans les domaines de la cristallisation et de l’électrostimulation musculaire. Ha-bilitation à diriger des recherches (hdr), University of Dijon (France), 2018. 62

[109] T. Bakir, B. Bonnard, and J. Rouot. A case study of optimal input-output system withsampled-data control: Ding et al. force and fatigue muscular control model. Netw. Heterog.Media, 14(1):79–100, 2019. 74

[110] P. Bettiol and H. Frankowska. Hölder continuity of adjoint states and optimal controls forstate constrained problems. Appl. Math. Optim., 57(1):125–147, 2008. 35, 55

[111] E. Bini. Design of optimal control systems. PhD thesis, University of Pisa, Italy, 2009. Tesi diLaurea Specialistica. 78, 81

[112] E. Bini and G. Buttazzo. The optimal sampling pattern for linear control systems. IEEE Trans.Automat. Control, 59(1):78–90, 2014. 83

[113] M. Bohner. Calculus of variations on time scales. Dynam. Systems Appl., 13(3-4):339–349,2004. 36


[114] M. Bohner, K. Kenzhebaev, O. Lavrova, and O. Stanzhytskyi. Pontryagin’s maximum prin-ciple for dynamic systems on time scales. J. Difference Equ. Appl., 23(7):1161–1189, 2017.48

[115] M. Bohner and A. Peterson. Dynamic equations on time scales. Birkhäuser Boston, Inc.,Boston, MA, 2001. An introduction with applications. 35, 37, 38, 47

[116] M. Bohner and A. Peterson. Advances in dynamic equations on time scales. BirkhäuserBoston, Inc., Boston, MA, 2003. 35, 37, 38

[117] V. G. Boltyanskiı. Optimal control of discrete systems. John Wiley & Sons, New York-Toronto,Ont.; Israel Program for Scientific Translations, Jerusalem, 1978. A Halsted Press Book,Translated from the Russian by Ron Hardin. 35, 42, 44

[118] V. G. Boltyanskiı and A. S. Poznyak. The robust maximum principle. Systems & Control:Foundations & Applications. Birkhäuser/Springer, New York, 2012. Theory and applications.45

[119] F. J. Bonnans, P. Martinon, and V. Grélard. Bocop - a collection of examples. Research ReportRR-8053, INRIA, 2012. 63, 73

[120] J. F. Bonnans and C. Sánchez-Fernández de la Vega. Optimal control of state constrainedintegral equations. Set-Valued Var. Anal., 18(3-4):307–326, 2010. 49

[121] B. Bonnard, L. Faubourg, G. Launay, and E. Trélat. Optimal control with state constraintsand the space shuttle re-entry problem. J. Dynam. Control Systems, 9(2):155–199, 2003. 35,55

[122] A. Bressan and B. Piccoli. Introduction to the mathematical theory of control, volume 2 ofAIMS Series on Applied Mathematics. American Institute of Mathematical Sciences (AIMS),Springfield, MO, 2007. 16, 26, 34, 80, 82, 83, 86

[123] A. Cabada and D. R. Vivero. Criterions for absolute continuity on time scales. J. DifferenceEqu. Appl., 11(11):1013–1028, 2005. 38

[124] A. Cabada and D. R. Vivero. Expression of the Lebesgue ∆-integral on time scales as a usualLebesgue integral: application to the calculus of ∆-antiderivatives. Math. Comput. Mod-elling, 43(1-2):194–207, 2006. 37, 38

[125] N. L. Carothers. Real analysis. Cambridge University Press, Cambridge, 2000. 55

[126] T. Chen and B. Francis. Optimal sampled-data control systems. Springer-Verlag London,Ltd., London, 1996. 36

[127] O. Cots. Contrôle optimal géométrique : méthodes homotopiques et applications. PhD thesis,University of Dijon (France), 2012. 63, 73

[128] O. Cots. Geometric and numerical methods for a state constrained minimum time controlproblem of an electric vehicle. ESAIM Control Optim. Calc. Var., 23(4):1715–1749, 2017. 35

[129] R. F. Curtain and A. J. Pritchard. The infinite-dimensional Riccati equation. J. Math. Anal.Appl., 47:43–57, 1974. 78

[130] B. N. Datta. Numerical methods for linear control systems. Elsevier Academic Press, SanDiego, CA, 2004. Design and analysis, With 1 CD-ROM (Windows, Macintosh and UNIX). 82

[131] J. Ding, S. A. Binder-Macleod, and A. Wexler. Two-step, predictive, isometric force modeltested on data from human and rat muscles. Journal of applied physiology, 85:2176–2189,1998. 62, 70


[132] J. Ding, A. Wexler, and S. A. Binder-Macleod. Development of a mathematical model thatpredicts optimal muscle activation patterns by using brief trains. Journal of applied physi-ology, 88:917–925, 2000. 62, 70

[133] J. Ding, A. Wexler, and S. A. Binder-Macleod. A predictive model of fatigue in human skeletalmuscles. Journal of applied physiology, 89:1322–1332, 2000. 62

[134] J. Ding, A. Wexler, and S. A. Binder-Macleod. Mathematical models for fatigue minimiza-tion during functional electrical stimulation. Journal of electromyography and kinesiology,13:575–588, 2003. 62, 70, 71

[135] A. V. Dmitruk. On the development of Pontryagin’s maximum principle in the works of A.Ya. Dubovitskii and A. A. Milyutin. Control Cybernet., 38(4A):923–957, 2009. 35, 57

[136] A. V. Dmitruk and A. M. Kaganovich. Maximum principle for optimal control problems withintermediate constraints. Comput. Math. Model., 22(2):180–215, 2011. Translation of Ne-lineınaya Din. Upr. No. 6 (2008), 101–136. 66

[137] P. Dorato and A. H. Levis. Optimal linear regulators: the discrete-time case. IEEE Trans.Automatic Control, AC-16:613–620, 1971. 78, 81, 83, 86

[138] I. Ekeland. On the variational principle. J. Math. Anal. Appl., 47:324–353, 1974. 16, 17, 27,43, 48, 65

[139] L. C. Evans. An introduction to mathematical optimal control theory. In Lecture notes, 1983.Version 0.2. 56

[140] M. S. Fadali and A. Visioli. Digital control engineering: Analysis and design. Elsevier, 2013.36

[141] H. O. Fattorini. Infinite-dimensional optimization and control theory, volume 62 of Encyclo-pedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1999.29, 45, 62

[142] R. A. C. Ferreira and D. F. M. Torres. Higher-order calculus of variations on time scales. InMathematical control theory and finance, pages 149–159. Springer, Berlin, 2008. 36

[143] A. Filippov. On some questions in the theory of optimal regulation: existence of a solutionof the problem of optimal regulation in the class of bounded measurable functions. VestnikMoskov. Univ. Ser. Mat. Meh. Astr. Fiz. Him., 1959(2):25–32, 1959. 16, 26, 88, 90

[144] A. Fryszkowski. Fixed point theory for decomposable sets, volume 2 of Topological Fixed PointTheory and Its Applications. Kluwer Academic Publishers, Dordrecht, 2004. 49

[145] S. Fukata and M. Takata. On sampling period sensitivities of the optimal stationary sampled-data linear regulator. Internat. J. Control, 29(1):145–158, 1979. 86

[146] R. Gesztelyi, J. Zsuga, A. Kemeny-Beke, B. Varga, B. Juhász, and A. Tósaki. The Hill equationand the origin of quantitative pharmacology. Archive for History of Exact Sciences, 66:427–438, 2012. 62

[147] I. V. Girsanov. Lectures on mathematical theory of extremum problems. Springer-Verlag,Berlin-New York, 1972. Edited by B. T. Poljak, Translated from the Russian by D. Louvish,Lecture Notes in Economics and Mathematical Systems, Vol. 67. 35

[148] K. A. Grasse. On the relation between small-time local controllability and normal self-reachability. Math. Control Signals Systems, 5(1):41–66, 1992. 92


[149] K. A. Grasse. Reachability of interior states by piecewise constant controls. Forum Math.,7(5):607–628, 1995. 92

[150] K. A. Grasse and H. J. Sussmann. Global controllability by nice controls. In Nonlinear con-trollability and optimal control, volume 133 of Monogr. Textbooks Pure Appl. Math., pages33–79. Dekker, New York, 1990. 92

[151] G. S. Guseinov. Integration on time scales. J. Math. Anal. Appl., 285(1):107–127, 2003. 38

[152] W. W. Hager. Rates of convergence for discrete approximations to unconstrained controlproblems. SIAM J. Numer. Anal., 13(4):449–472, 1976. 84

[153] W. W. Hager. Runge-kutta methods in optimal control and the transformed adjoint system.Numer. Math., 87(2):247–282, 2000. 84

[154] H. Halkin. A maximum principle of the Pontryagin type for systems described by nonlineardifference equations. SIAM J. Control, 4:90–111, 1966. 35

[155] R. F. Hartl, S. P. Sethi, and R. G. Vickson. A survey of the maximum principles for optimalcontrol problems with state constraints. SIAM Rev., 37(2):181–218, 1995. 35, 42, 44, 52, 54,55

[156] M. R. Hestenes. Calculus of variations and optimal control theory. Robert E. Krieger Pub-lishing Co., Inc., Huntington, N.Y., 1980. Corrected reprint of the 1966 original. 16, 34

[157] S. Hilger. Analysis on measure chains – A unified approach to continuous and discrete cal-culus. Results Math., 18(1-2):18–56, 1990. 35

[158] R. Hilscher and V. Zeidan. Calculus of variations on time scales: weak local piecewise C 1rd

solutions with variable endpoints. J. Math. Anal. Appl., 289(1):143–166, 2004. 36

[159] R. Hilscher and V. Zeidan. Weak maximum principle and accessory problem for controlproblems on time scales. Nonlinear Anal., 70(9):3209–3226, 2009. 36

[160] J. M. Holtzman. Convexity and the maximum principle for discrete systems. IEEE Trans.Automatic Control, AC-11:30–35, 1966. 35

[161] J. M. Holtzman and H. Halkin. Directional convexity and the maximum principle for discretesystems. SIAM J. Control, 4:263–275, 1966. 35, 44

[162] A. Huseynov. The Riesz representation theorem on time scales. Math. Comput. Modelling,55(3-4):1570–1579, 2012. 38

[163] D. H. Jacobson, M. M. Lele, and J. L. Speyer. New necessary conditions of optimality forcontrol problems with state-variable inequality constraints. J. Math. Anal. Appl., 35:255–284, 1971. 35, 55

[164] R. E. Kalman and R. W. Koepcke. Optimal synthesis of linear sampling control systems usinggeneralized performande inedxes. 80:1820–1826, 1958. 78, 81

[165] D. Kleinman and M. Athans. The Discrete Minimum Principle with Application to the Lin-ear Regulator Problem. Report (Massachusetts Institute of Technology. Electronic SystemsLaboratory). M.I.T. Electronic Systems Laboratory, 1966. 78

[166] V. Kucera. The discrete Riccati equation of optimal control. Kybernetika (Prague), 8:430–447,1972. 78

[167] H. Kwakernaak and R. Sivan. Linear optimal control systems. Wiley-Interscience [John Wiley& Sons], New York-London-Sydney, 1972. 78, 80, 82


[168] I. D. Landau and G. Zito. Digital Control Systems: Design, Identification and Implementation.Springer, 2006. 36

[169] L. A. F. Law and R. K. Shields. Mathematical models of human paralyzed muscle after long-term training. Journal of biomechanics, 40:2587–2595, 2007. 62, 70

[170] E. B. Lee and L. Markus. Foundations of optimal control theory. Robert E. Krieger PublishingCo., Inc., Melbourne, FL, second edition, 1986. 16, 34, 78, 80, 82, 83, 86

[171] A. H. Levis and R. A. Schlueter. On the behaviour of optimal linear sampled-data regulators.International Journal of Control, 13(2):343–361, 1971. 62, 78, 83, 86

[172] X. J. Li and J. M. Yong. Optimal control theory for infinite-dimensional systems. Systems &Control: Foundations & Applications. Birkhäuser Boston, Inc., Boston, MA, 1995. 16, 42, 49,50

[173] Y. Li and Y. Chen. Fractional order linear quadratic regulator. pages 363 – 368, 11 2008. 78

[174] N. Martins and D. F. M. Torres. Calculus of variations on time scales with nabla derivatives.Nonlinear Anal., 71(12):e763–e773, 2009. 36

[175] H. Maurer. On optimal control problems with bounded state variables and control appear-ing linearly. SIAM J. Control Optim., 15(3):345–362, 1977. 35, 55

[176] H. Maurer, J. R. Kim, and G. Vossen. On A State-Constrained Control Problem in OptimalProduction and Maintenance, volume 7 of Deissenberg C., Hartl R.F. (eds) Optimal Controland Dynamic Games. Advances in Computational Management Science. Springer, Boston,MA, 2005. 35

[177] S. M. Melzer and B. C. Kuo. Sampling period sensitivity of the optimal sampled data linearregulator. Automatica J. IFAC, 7:367–370, 1971. 62, 78, 83, 86

[178] R. H. Middleton and G. C. Goodwin. Digital control and estimation: A unified approach.1990. 78, 83

[179] B. S. Mordukhovich. Variational analysis and generalized differentiation. I, volume 330 ofGrundlehren der Mathematischen Wissenschaften [Fundamental Principles of MathematicalSciences]. Springer-Verlag, Berlin, 2006. Basic theory. 50

[180] D. Nešic, A. R. Teel, and P. V. Kokotovic. Sufficient conditions for stabilization of sampled-data nonlinear systems via discrete-time approximations. Systems Control Lett., 38(4-5):259–270, 1999. 84

[181] A. Polyanin and V. Zaitsev. Handbook of Exact Solutions for Ordinary Differential Equations.Chapman and Hall/CRC, 2018. 86

[182] L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko. The mathemat-ical theory of optimal processes. Translated from the Russian by K. N. Trirogoff; edited by L.W. Neustadt. Interscience Publishers John Wiley & Sons, Inc. New York-London, 1962. 34,42, 89

[183] B. N. Pshenichnyi. Necessary conditions for an extremum, volume 4 of Translated from theRussian by Karol Makowski. Translation edited by Lucien W. Neustadt. Pure and AppliedMathematics. Marcel Dekker, Inc., New York, 1971. 35, 42

[184] C. Rackauckas and Q. Nie. DifferentialEquations.jl - A performant and feature-rich ecosys-tem for solving differential equations in Julia. The Journal of Open Research Software, 5(1),2017. 73


[185] J. Ragazzini and G. Franklin. Sampled-data control systems. McGraw-Hill series in controlsystems engineering. McGraw-Hill, 1958. 36

[186] W. Rudin. Real and complex analysis. McGraw-Hill Book Co., New York, third edition, 1987.50

[187] M. Salgado, R. Middleton, and G. C. Goodwin. Connection between continuous and discreteRiccati equations with applications to Kalman filtering. Proc. IEE-D, 135(1):28–34, 1988. 78

[188] R. A. Schlueter. The optimal linear regulator with constrained sampling times. IEEE Trans.Automatic Control, AC-18(no.5):515–518, 1973. 62

[189] R. A. Schlueter and A. H. Levis. The optimal linear regulator with state-dependent sampling.IEEE Trans. Automatic Control, AC-18(5):512–515, 1973. 62

[190] S. P. Sethi and G. L. Thompson. Optimal control theory. Kluwer Academic Publishers,Boston, MA, second edition, 2000. Applications to management science and economics.16, 34, 35, 52

[191] W. Sierpinski. Sur les fonctions d’ensemble additives et continues. Fundamenta Mathemat-icae, 3:240–246, 1922. 49

[192] E. D. Sontag. Mathematical control theory, volume 6 of Texts in Applied Mathematics.Springer-Verlag, New York, second edition, 1998. Deterministic finite-dimensional systems.78, 80, 82

[193] J. H. Sussmann. Reachability by means of nice controls. In Proceedings of the 26th IEEEConference on Decision and Control, pages 1368–1373, 1987. 92

[194] W. J. Terrell. Stability and stabilization. Princeton University Press, Princeton, NJ, 2009. Anintroduction. 82

[195] E. Trélat. Contrôle optimal. Mathématiques Concrètes. [Concrete Mathematics]. Vuibert,Paris, 2005. Théorie & applications. [Theory and applications]. 16, 34, 55, 80, 82, 83, 86

[196] T. van Keulen, J. Gillot, B. de Jager, and M. Steinbuch. Solution for state constrained optimalcontrol problems applied to power split control for hybrid vehicles. Automatica J. IFAC,50(1):187–192, 2014. 35

[197] R. Vinter. Optimal control. Modern Birkhäuser Classics. Birkhäuser Boston, Inc., Boston,MA, 2010. Paperback reprint of the 2000 edition. 16, 34, 42, 44, 57

[198] G. Weiss and R. Rebarber. Optimizability and estimatability for infinite-dimensional linearsystems. SIAM J. Control Optim., 39(4):1204–1232, 2000. 82

[199] M. Yochum. Contribution à la conception d’un électromyostimulateur intelligent. PhD the-sis, University of Dijon (France), 2013. 62

[200] J. Yuz. Sampled-data models for linear and nonlinear systems. PhD thesis, University ofNewcastle, Australia, 2005. 78

[201] J. Zabczyk. Mathematical control theory. Modern Birkhäuser Classics. Birkhäuser Boston,Inc., Boston, MA, 2008. An introduction, Reprint of the 1995 edition. 82

[202] Z. Zhan, S. Chen, and W. Wei. A unified theory of maximum principle for continuous anddiscrete time optimal control problems. Math. Control Relat. Fields, 2(2):195–215, 2012. 36

[203] J. Zhu. On stochastic Riccati equations for the stochastic LQR problem. Systems ControlLett., 54(2):119–124, 2005. 78

Part III

Contributions to variational analysis in view of shape optimization problems in contact mechanics

Chapter 7

Flip procedure in geometric approximation of multiple-component shapes

First steps in shape optimization theory


7.1 Introduction
7.2 Intersecting control polygons detection and flip procedure
7.2.1 Basics on piecewise Bézier curves
7.2.2 A brief overview of the flip procedure introduced in [B25]
7.3 Application to multiple-inclusion detection
7.3.1 Problem setting
7.3.2 Computation of the shape gradient
7.3.3 Numerical simulations
7.4 Concluding comments and perspectives

When I was a PhD student at the University of Pau, I shared my office with another PhD student, Fabien Caubet, who was studying obstacle detection problems with the help of shape optimization tools. Through our numerous discussions, we became eager to collaborate in that field. Since September 2014, I have organized the weekly seminar of the MOD team (Modeling, Optimization and Dynamics) at the University of Limoges. In October 2014, I invited Fabien Caubet to present his work, in which one aims to reconstruct numerically a (one-component) obstacle $\omega_{\mathrm{ex}}$ living in a larger two-dimensional bounded domain $\Omega \subset \mathbb{R}^2$ from boundary measurements on $\partial\Omega$ (see Section 7.3 for more details). His numerical reconstructions were based, in particular, on a parameterization of the boundary using truncated Fourier series. It turns out that Pierre Bonnelie, who was a PhD student at the University of Limoges, and his supervisor Olivier Ruatta were working on shape optimization problems arising in the design of bandpass microwave filters [215], using, in particular, Bézier parameterizations. The four of us discussed and realized that Bézier parameterizations could be suitable for detecting multiple-component inclusions with no a priori knowledge of the number of components (while the parameterization based on truncated Fourier series is not, see Remark 7.4). Fabien Caubet and I took this opportunity to start a collaboration jointly with Pierre Bonnelie and Olivier Ruatta. The present chapter summarizes the contribution of the following publication:

• [B25]: P. Bonnelie, L. Bourdin, F. Caubet and O. Ruatta. Flip procedure in geometric approximation of multiple-component shapes – Application to multiple-inclusion detection. SMAI J. Comput. Math., 2:255–276, 2016.



7.1 Introduction

Geometric shape approximation methods are commonly based on successive shape deformations, where the boundary of the approximated shape is parameterized and evolves at each step in a direction given by the deformation flow. This technique is widely used, for example, in shape optimization problems where the flow is given by the so-called shape gradient (see, e.g., Chapter 5 of the book [240] by Henrot and Pierre), or in image segmentation (see, e.g., [245]). Numerous parameterizations of the boundary have been considered in the literature, such as polygons, Fourier series, etc. Each of these parameterizations has its own advantages and drawbacks, which depend on the nature of the problem studied.

In the paper [B25], written in collaboration with Bonnelie and Ruatta from the University of Limoges and Caubet from the University of Pau, we were especially interested in the geometric approximation of multiple-component shapes, in particular when the number of components is a priori unknown. Starting a parameterization method with a one-component initial shape in order to approximate a multiple-component target shape usually leads the deformation flow to make the boundary evolve until it surrounds all the components of the target shape (see Figure 7.1 for illustrations). This classical phenomenon tends to create double points on the boundary of the approximated shape.

Figure 7.1 – Illustrations of geometric shape approximation of a two-component target shape starting from a one-component initial shape: (a) two-dimensional case; (b) three-dimensional case. Each panel shows the target shape and the approximated shape.

In order to improve the approximation of multiple-component shapes, our idea in [B25] was to look for an appropriate parameterization that makes it possible to achieve two numerical tasks:

(i) firstly, the parameterization has to be well suited to preventing the potential formation of double points, i.e. to locating the parts of the boundary that are close to each other;

(ii) secondly, it has to make it easy to change the topology of the approximated shape, precisely to divide a one-component shape into a two-component shape.

For these purposes, we proposed in [B25] a method based on a Bézier parameterization. The main idea is that this polynomial parameterization can be approximated by its control polygon. Thus one can easily prevent the formation of double points by looking for intersecting control polygons. Moreover, once this first step is achieved, one can reorganize the control points of the Bézier parameterization in order to modify the topology of the shape, precisely in order to divide one component into two. In the paper [B25], the above method is detailed in the two-dimensional case, using piecewise Bézier curves. I refer to Section 7.2 for details on the so-called intersecting control polygons detection and flip procedure. I take this opportunity to mention that the extension of these two procedures to the three-dimensional case remains an open challenge with specific difficulties, in particular in the reorganization of the control points for the flip procedure.

In order to illustrate these two procedures, we performed in [B25] numerical simulations on a two-dimensional obstacle detection problem. Precisely, let us consider the inverse problem of detecting some unknown inclusion $\omega_{\mathrm{ex}}$ in a larger bounded domain $\Omega \subset \mathbb{R}^2$ from boundary measurements made on $\partial\Omega$. The aim is to reconstruct numerically a geometric approximation of the target shape $\omega_{\mathrm{ex}}$ using shape optimization tools (see Figure 7.2 for an illustration). In the paper [B25] we studied this inverse problem by minimizing a shape least squares functional.

Figure 7.2 – Illustration of reconstruction for a two-dimensional obstacle detection problem. The figure shows the exterior boundary $\partial\Omega$, the target shape (or exact shape) $\omega_{\mathrm{ex}}$, the initial approximated shape and the final approximated shape.

Remark 7.1. Before going any further, I give in this remark some brief reminders on the major shape optimization techniques used in the literature. The two main categories are topological and geometric shape optimization methods. The topological gradient approach was introduced by Schumacher in [279] and Sokołowski et al. in [289]. This method is based on asymptotic expansions and is consequently adapted essentially to relatively small inclusions. Moreover, even if topological optimization is useful for finding the number of components, it may not be well suited to finding a satisfactory approximation of the shape of the inclusion (see, e.g., [222] and references therein). I refer to [207] and references therein for a comprehensive mathematical treatment with theoretical and numerical results about the reconstruction of small inclusions from boundary measurements. In the geometric shape optimization category, two main techniques are addressed in the literature. They are both based on the computation of a shape gradient used as a flow making the shape evolve. These two methods use different representations of the shape and different techniques to deform it. The first approach is the so-called level set approach (see, e.g., the survey [221] of Burger et al. and references therein, or [282]). It is originally based on an implicit representation of the approximated shape on a fixed mesh. In order to detect a multiple-component inclusion, this method does not need any a priori knowledge of the number of components. The second approach is based on boundary variations via mesh variations and, in the case of inverse problems, on an explicit representation of the approximated shape. This method is used, e.g., in the work [205] of Afraites et al. Note that the standard algorithm based on shape derivatives moving the mesh does not provide the opportunity to change the topology of the shape, and consequently the number of components has to be known in advance. Finally, recent works propose to mix several of the above approaches, but it is not my aim to provide a state of the art here.

The method used in our paper [B25] is based only on mesh variation techniques. The parameterization by piecewise Bézier curves and the flip procedure make it possible to change the topology of the shape dynamically in order to find the number of components, and the shape derivative approach makes it possible to approximate the shape of the inclusion with an explicit representation. This original method seems to be well suited to the two-dimensional obstacle detection problem mentioned above, in particular in the case where the number of components is a priori unknown.

7.2 Intersecting control polygons detection and flip procedure

We first give in Section 7.2.1 some basic reminders on piecewise Bézier curves (see, e.g., [234, 281] for more details). Then we provide in Section 7.2.2 a brief overview of the intersecting control polygons detection and of the flip procedure, both introduced in the paper [B25]. It is not my aim in this chapter to provide a precise description of these two procedures: I refer the interested reader to [B25, Section 3] for a detailed presentation.

7.2.1 Basics on piecewise Bézier curves

Let $d \in \mathbb{N}^*$ and let $P_0, \ldots, P_d$ be a set of $d+1$ points of $\mathbb{R}^2$. The associated Bézier curve $B([P_0, \ldots, P_d])$ is defined by

$$\forall t \in [0,1], \quad B([P_0, \ldots, P_d], t) := \sum_{j=0}^{d} P_j \binom{d}{j} t^j (1-t)^{d-j},$$

involving the classical Bernstein polynomials. The positive integer $d$ is the degree of the curve and the points $P_0, \ldots, P_d$ are its control points (or its control polygon). Note that a Bézier curve does not go through its control points in general. However, it starts at $P_0$ and finishes at $P_d$. If $P_0 = P_d$, the Bézier curve is said to be closed. Each point of a Bézier curve is a convex combination of its control points. Hence a Bézier curve lies in the convex hull of its control polygon (see Figure 7.3).

Figure 7.3 – A non-closed Bézier curve of degree 4 lying in the convex hull of its control polygon, with control points $P_0, \ldots, P_4$.
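To make the definition concrete, here is a minimal Python sketch (an illustration only, not code from [B25]; the function name `bezier_point` and the sample control points are hypothetical). It evaluates a Bézier curve directly from the Bernstein form and produces three sample points: the two endpoints and a point at $t = 1/2$.

```python
from math import comb


def bezier_point(control_points, t):
    """Evaluate B([P_0, ..., P_d], t) for t in [0, 1] from the Bernstein form.

    control_points is a list of d + 1 pairs (x, y); the name and the data
    layout are illustrative choices, not taken from [B25].
    """
    d = len(control_points) - 1
    x = y = 0.0
    for j, (px, py) in enumerate(control_points):
        # Bernstein weight: binom(d, j) * t^j * (1 - t)^(d - j)
        w = comb(d, j) * t**j * (1.0 - t) ** (d - j)
        x += w * px
        y += w * py
    return (x, y)


# Hypothetical degree-4 control polygon (five control points).
P = [(0.0, 0.0), (1.0, 2.0), (2.0, -1.0), (3.0, 2.0), (4.0, 0.0)]
start = bezier_point(P, 0.0)   # the curve starts at P_0
end = bezier_point(P, 1.0)     # the curve finishes at P_d
middle = bezier_point(P, 0.5)  # a convex combination of the control points
```

Since the Bernstein weights are nonnegative and sum to one, every evaluated point stays in the convex hull of the control points, and in particular in their bounding box.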

Using a single closed Bézier curve in order to approximate a two-dimensional shape is not an efficient method for several reasons. Indeed, in order to approximate a shape with many geometric features, one would need to increase the number of degrees of freedom, i.e. the number of control points. However, as is very well known, increasing the degree of an approximating polynomial curve leads to a classical oscillation phenomenon and, in the particular case of a Bézier polynomial curve, it leads to numerical instabilities (due to the ill-conditioning of the Bernstein–Vandermonde matrices, see [252]). Moreover, since each control point has a global influence on the curve, one cannot handle local complexities of a shape with a single Bézier curve. The classical idea is then to divide the curve into several Bézier curves of small degree. This leads us to the following definition of piecewise Bézier curves. Let $N \in \mathbb{N}^*$, $d \in \mathbb{N}^*$ and let $P_{1,0}, \ldots, P_{1,d}, \ldots, P_{N,d}$ be a set of $N(d+1)$ control points of $\mathbb{R}^2$ satisfying the continuity relations $P_{i,d} = P_{i+1,0}$ for every $i = 1, \ldots, N-1$. The associated piecewise Bézier curve, denoted (abusively) by $B([P_{1,0}, \ldots, P_{N,d}])$, is defined by

$$\forall t \in [0,1], \quad B([P_{1,0}, \ldots, P_{N,d}], t) := B([P_{i,0}, \ldots, P_{i,d}], Nt - i + 1) \quad \text{if } t \in \left[ \frac{i-1}{N}, \frac{i}{N} \right], \quad i = 1, \ldots, N.$$

The global curve is then composed of $N$ Bézier curves called patches. A piecewise Bézier curve goes through $P_{i,0}$ and $P_{i,d}$ for all $i = 1, \ldots, N$ and is said to be closed if $P_{1,0} = P_{N,d}$.
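The piecewise definition can be sketched in the same spirit. The following Python fragment is only an illustration under assumptions of my own (the function names, the list-of-patches layout and the two sample cubic patches are hypothetical and not taken from [B25]): it selects the patch index $i$ with $t \in [(i-1)/N, i/N]$ and evaluates that patch at the local parameter $Nt - i + 1$.

```python
from math import comb


def bezier_point(control_points, t):
    """Evaluate a single Bézier curve at t in [0, 1] (Bernstein form)."""
    d = len(control_points) - 1
    x = y = 0.0
    for j, (px, py) in enumerate(control_points):
        w = comb(d, j) * t**j * (1.0 - t) ** (d - j)
        x += w * px
        y += w * py
    return (x, y)


def piecewise_bezier_point(patches, t):
    """Evaluate a piecewise Bézier curve at t in [0, 1].

    patches[i - 1] holds the control points P_{i,0}, ..., P_{i,d} of patch i
    and is assumed to satisfy the continuity relations P_{i,d} = P_{i+1,0}.
    """
    n = len(patches)
    # Patch index i in {1, ..., N} such that t lies in [(i - 1)/N, i/N].
    i = min(int(t * n) + 1, n)
    # Local parameter N t - i + 1, which lies in [0, 1].
    s = n * t - i + 1
    return bezier_point(patches[i - 1], s)


def is_closed(patches):
    """A piecewise Bézier curve is closed if P_{1,0} = P_{N,d}."""
    return patches[0][0] == patches[-1][-1]


# Two hypothetical cubic patches (d = 3) forming a closed curve.
patches = [
    [(0.0, 0.0), (1.0, 2.0), (2.0, 2.0), (3.0, 0.0)],
    [(3.0, 0.0), (2.0, -2.0), (1.0, -2.0), (0.0, 0.0)],
]
junction = piecewise_bezier_point(patches, 0.5)  # common point P_{1,3} = P_{2,0}
```

At a junction parameter $t = i/N$ both adjacent patches give the same point thanks to the continuity relations, so the choice of patch made by the sketch at such a parameter is immaterial.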

Remark 7.2. In practice we use cubic patches ($d = 3$) because they are sufficient to cover many geometric situations, such as inflection points (see Figure 7.4).

Adapting the proof of the Stone–Weierstrass theorem, one can prove the following result, which corresponds to a particular case of the Bishop theorem [214] and which fully justifies the use of piecewise Bézier curves in order to approximate two-dimensional bounded shapes.

Proposition 7.3. Let $f \in C([0,1], \mathbb{R}^2)$. For all $\varepsilon > 0$ and all $d \in \mathbb{N}^*$, there exist $N \in \mathbb{N}^*$ and a set of $N(d+1)$ control points $P_{1,0}, \ldots, P_{1,d}, \ldots, P_{N,d}$, satisfying the continuity relations, such that $\| f(t) - B([P_{1,0}, \ldots, P_{N,d}], t) \|_{\mathbb{R}^2} \leq \varepsilon$ for all $t \in [0,1]$.
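For the reader's convenience, here is one elementary way to see why such a statement holds. It is a sketch of my own based on chord interpolation, not the Stone–Weierstrass-type argument mentioned above, and the modulus of continuity $\omega_f$ is notation introduced only for this sketch.

```latex
% Sketch under the assumption that \omega_f denotes the modulus of continuity of f,
% i.e. \omega_f(\delta) := \sup \{ \| f(t) - f(t') \|_{\mathbb{R}^2} : |t - t'| \leq \delta \},
% which tends to 0 as \delta \to 0 since f is uniformly continuous on [0,1].
Choose $N \in \mathbb{N}^*$ such that $\omega_f(1/N) \leq \varepsilon$ and set
\[
  P_{i,j} := \Big( 1 - \frac{j}{d} \Big) f\Big( \frac{i-1}{N} \Big) + \frac{j}{d} \, f\Big( \frac{i}{N} \Big),
  \qquad i = 1, \ldots, N, \quad j = 0, \ldots, d .
\]
The continuity relations hold since $P_{i,d} = f(i/N) = P_{i+1,0}$. Using
\[
  \sum_{j=0}^{d} \binom{d}{j} s^j (1-s)^{d-j} = 1
  \quad \text{and} \quad
  \sum_{j=0}^{d} \frac{j}{d} \binom{d}{j} s^j (1-s)^{d-j} = s ,
\]
each patch reduces to the chord
\[
  B([P_{i,0}, \ldots, P_{i,d}], s) = (1-s) \, f\Big( \frac{i-1}{N} \Big) + s \, f\Big( \frac{i}{N} \Big) .
\]
Hence, for $t \in [\frac{i-1}{N}, \frac{i}{N}]$ and $s := Nt - i + 1 \in [0,1]$,
\[
  \| f(t) - B([P_{1,0}, \ldots, P_{N,d}], t) \|_{\mathbb{R}^2}
  \leq (1-s) \Big\| f(t) - f\Big( \frac{i-1}{N} \Big) \Big\|_{\mathbb{R}^2}
     + s \Big\| f(t) - f\Big( \frac{i}{N} \Big) \Big\|_{\mathbb{R}^2}
  \leq \omega_f(1/N) \leq \varepsilon .
\]
```

This construction only shows existence: it produces a polygonal line written as a piecewise Bézier curve, and it says nothing about how few patches a well-chosen set of control points would need.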



Figure 7.4 – A closed piecewise Bézier curve composed of seven cubic patches.

Remark 7.4. Recall that the use of polar coordinates, where the radius is expanded in a truncated Fourier series, is another common and efficient strategy for approximating two-dimensional shapes (see, e.g., [205] in the context of inclusion detection). However, it has two main drawbacks. Firstly, it can represent only star-shaped domains and secondly, due to a classical oscillation phenomenon, it cannot represent straight lines exactly (see, e.g., [224, Figure 5] in the context of inclusion detection). The use of piecewise Bézier curves constitutes an alternative for approximating non-star-shaped domains and straight lines (see Section 7.3.3 for some numerical simulations in the context of inclusion detection). To conclude this remark, recall that the flip procedure, which is the main topic of the paper [B25], is based on the detection of potential collisions between two parts of the boundary of the approximated shape (see Section 7.2.2 for more details). Thus, it is worth pointing out that a parameterization based on polar coordinates, where the radius is expanded in a truncated Fourier series, is not adapted to preventing such collisions.
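For concreteness, a parameterization of this type can be written in the following generic form. The center $c$, the truncation order $K$ and the coefficients $a_k$, $b_k$ are notation introduced here for illustration only, and the precise form used in [205] may differ.

```latex
% Generic polar parameterization with a truncated Fourier radius (illustrative notation).
\[
  \partial\omega = \big\{ c + r(\theta) \, (\cos\theta, \sin\theta) \; : \; \theta \in [0, 2\pi) \big\},
  \qquad
  r(\theta) = a_0 + \sum_{k=1}^{K} \big( a_k \cos(k\theta) + b_k \sin(k\theta) \big) > 0 .
\]
```

Each ray emanating from $c$ meets such a boundary exactly once, so the enclosed domain is star-shaped with respect to $c$, and a single radius function $r$ describes a single closed curve.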

7.2.2 A brief overview of the flip procedure introduced in [B25]

As mentioned in the Introduction, we were interested in [B25] in two-dimensional shape approximation problems in which the target shape can have multiple connected components but the number of components is a priori unknown. Our major aim was to provide a simple and original concept (called flip procedure) that can be added to any two-dimensional shape approximation algorithm based on piecewise cubic Bézier curves, and which makes it possible to change the topology of the approximated shape. Precisely, the flip procedure divides a one-component shape into a two-component shape. In this section, I provide a brief overview of this technique.

Consider a general two-dimensional shape approximation algorithm in which the boundary of the approximated shape is parameterized by a piecewise cubic Bézier curve. It starts from a one-component initial shape $\omega_0$ and produces a sequence of one-component shapes $(\omega_k)_{k \ge 0}$ by deforming the boundary at each step. The idea proposed in [B25] consists of two phases (summarized in Figure 7.5):

(i) Check, at each step of the approximation algorithm, whether the current shape $\omega_k$ is in the situation depicted in Figure 7.1(a), that is, whether two parts of the boundary are very close to each other. The parameterization by a piecewise cubic Bézier curve makes it possible to prevent such a situation by looking for intersecting control polygons (see Remark 7.5 below for details). This procedure is called intersecting control polygons detection.

(ii) If some control polygons intersect each other, we apply the flip procedure in order to obtain a two-component shape, keeping all other control polygons unchanged. This procedure consists in a reorganization of the control points, building two new control polygons out of (two or three) intersecting control polygons. I refer to [B25, Section 3.3] for the details.

It is not my aim in this chapter to provide a precise and long description of the complete procedure. I refer the interested reader to [B25, Sections 3.2 and 3.3] for a detailed presentation and more comments, in particular on the possible difficulties encountered in practice and on the tricks that help overcome them (such as the split and merge functions, which make it possible to control the size of the patches during the iterations of the algorithm, see [B25, Remark 3.3]).

[Flowchart: Shape $\omega_k$ → Scan for intersecting control polygons → Two intersecting control polygons → Flip → two-component shape $\omega_{k+1}$]

Figure 7.5 – Overview of the complete procedure.

Remark 7.5. This remark concerns the intersecting control polygons detection. Checking whether each control polygon intersects another one may be very expensive in terms of computations. Axis-Aligned Bounding Boxes (AABBs) are a very common tool in Computer Graphics and Computational Geometry to detect the collision of two objects (see, e.g., [232]) at a relatively low computational cost. The AABB of a control polygon is defined as the smallest rectangle, whose sides are aligned with the axes, containing the control polygon (see Figure 7.6). A necessary condition for two control polygons to intersect is clearly that their respective AABBs intersect. As a consequence, instead of looking directly for intersecting control polygons, we first look for intersecting AABBs. Thus, the intersecting control polygons detection consists of two steps: we first list all the pairs of intersecting AABBs and then, in a second step, we check these pairs in order to see whether the associated control polygons intersect. To do so, we directly check the 9 segment-segment intersections of the polygons (each cubic control polygon has three segments; see, e.g., [262, p. 28-30]).

Figure 7.6 – AABBs of control polygons (axes $x$ and $y$).
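The two-step detection described in Remark 7.5 can be sketched as follows. This is a minimal illustration, not the code of [B25]: the function names, the orientation-based segment test and the choice of skipping consecutive patches (which always share an end point) are assumptions.

```python
from itertools import combinations

def aabb(polygon):
    """Smallest axis-aligned rectangle containing the control points."""
    xs, ys = zip(*polygon)
    return min(xs), min(ys), max(xs), max(ys)

def aabbs_overlap(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p)."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)

def on_segment(p, q, r):
    """For collinear p, q, r: does r lie in the bounding box of [p, q]?"""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))

def segments_intersect(p1, p2, q1, q2):
    o1, o2 = orient(p1, p2, q1), orient(p1, p2, q2)
    o3, o4 = orient(q1, q2, p1), orient(q1, q2, p2)
    if o1 != o2 and o3 != o4:
        return True
    # remaining cases: an end point lies on the other segment
    return ((o1 == 0 and on_segment(p1, p2, q1))
            or (o2 == 0 and on_segment(p1, p2, q2))
            or (o3 == 0 and on_segment(q1, q2, p1))
            or (o4 == 0 and on_segment(q1, q2, p2)))

def polygons_intersect(poly_a, poly_b):
    """3 x 3 = 9 segment-segment tests for two cubic control polygons."""
    segs_a = list(zip(poly_a, poly_a[1:]))
    segs_b = list(zip(poly_b, poly_b[1:]))
    return any(segments_intersect(a0, a1, b0, b1)
               for a0, a1 in segs_a for b0, b1 in segs_b)

def intersecting_pairs(polygons):
    """Step 1: cheap AABB filter. Step 2: exact test on the surviving pairs."""
    boxes = [aabb(p) for p in polygons]
    n = len(polygons)
    hits = []
    for i, j in combinations(range(n), 2):
        if j - i == 1 or (i == 0 and j == n - 1):
            continue  # assumption: neighbours on the closed curve are skipped
        if aabbs_overlap(boxes[i], boxes[j]) and polygons_intersect(polygons[i], polygons[j]):
            hits.append((i, j))
    return hits

# Hypothetical data: polygons 0 and 2 cross, polygons 1 and 3 are far apart.
demo_polygons = [
    [(0, 0), (1, 1), (2, 1), (3, 0)],
    [(3, 0), (5, 0), (5, 3), (3, 3)],
    [(3, 3), (2, 0.5), (1, 0.5), (0, 3)],
    [(0, 3), (-2, 3), (-2, 0), (0, 0)],
]
```

The AABB filter costs four comparisons per pair, so the nine exact segment tests are only run on the few pairs whose boxes overlap.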

7.3 Application to multiple-inclusion detection

This section focuses on the inverse problem of numerically reconstructing an obstacle $\omega_{\mathrm{ex}}$ living in a larger bounded domain $\Omega \subset \mathbb{R}^2$ from boundary measurements on $\partial\Omega$. Our aim is in particular to test the flip procedure introduced before in the case where $\omega_{\mathrm{ex}}$ is a two-component obstacle (see Section 7.3.3). To this aim we follow a standard shape optimization approach, minimizing a shape cost functional by means of shape derivatives and of a shape gradient descent method. I refer to the classical books by Henrot and Pierre [240] and by Sokołowski and Zolésio [290] for more details on the techniques of shape differentiability.



7.3.1 Problem setting

Let us fix some notation that will be used throughout Section 7.3. We denote by $L^p$, $W^{m,p}$ and $H^m$ the usual Lebesgue and Sobolev spaces. Let $\Omega$ be a nonempty bounded and connected open set of $\mathbb{R}^2$ with a $C^{1,1}$-boundary and let $g \in H^{3/2}(\partial\Omega)$ such that $g \neq 0$. We denote by $n$ the external unit normal to $\partial\Omega$, and for a smooth enough function $u$, we denote by $\partial_n u := \nabla u \cdot n$ the normal derivative of $u$. Let $0 < \varepsilon < 1$ be fixed (small). In the sequel $\mathcal{O}$ stands for the set of all open subsets $\omega$ strictly included in $\Omega$, with a $C^{1,1}$-boundary, such that the distance $d(x,\partial\Omega)$ from all $x \in \omega$ to the compact boundary $\partial\Omega$ is strictly greater than $\varepsilon$, and such that $\Omega \setminus \omega$ is connected.

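We focus on the following inverse problem. Assume that an unknown obstacle $\omega_{\mathrm{ex}} \in \mathcal{O}$ is located inside $\Omega$. We consider hereafter the Laplace equation in $\Omega \setminus \omega_{\mathrm{ex}}$ with homogeneous Dirichlet boundary condition on $\partial\omega_{\mathrm{ex}}$ and nonhomogeneous Dirichlet boundary condition on $\partial\Omega$. Precisely we denote by $u_{\mathrm{ex}} \in H^1(\Omega \setminus \omega_{\mathrm{ex}})$ the unique solution to the problem

\[
\begin{cases}
-\Delta u_{\mathrm{ex}} = 0 & \text{in } \Omega \setminus \omega_{\mathrm{ex}}, \\
u_{\mathrm{ex}} = g & \text{on } \partial\Omega, \\
u_{\mathrm{ex}} = 0 & \text{on } \partial\omega_{\mathrm{ex}}.
\end{cases}
\tag{7.1}
\]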

Our main purpose is to numerically reconstruct the unknown shape $\omega_{\mathrm{ex}}$, assuming that a measurement is made on the exterior boundary $\partial\Omega$. Precisely we assume that we know exactly the value of the measurement $\mu_b := \partial_n u_{\mathrm{ex}} \in H^{1/2}(\partial\Omega)$. Thus we are interested in the following problem:

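find $\omega \in \mathcal{O}$ and $u \in H^1(\Omega \setminus \omega)$ which satisfy the overdetermined system

\[
\begin{cases}
-\Delta u = 0 & \text{in } \Omega \setminus \omega, \\
u = g & \text{on } \partial\Omega, \\
\partial_n u = \mu_b & \text{on } \partial\Omega, \\
u = 0 & \text{on } \partial\omega.
\end{cases}
\tag{7.2}
\]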

The existence of a solution is trivial since we assume that the measurement $\mu_b$ is exact. Furthermore Holmgren's theorem (see, e.g., [243]) ensures that the solution is unique. In order to solve Problem (7.2), we will focus on the shape optimization problem

\[
\omega^* \in \operatorname*{argmin}_{\omega \in \mathcal{O}} J(\omega),
\tag{7.3}
\]

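where $J$ is the nonnegative least squares functional defined by

\[
J(\omega) := \int_{\partial\Omega} |\partial_n u_\omega - \mu_b|^2,
\]

where $u_\omega \in H^1(\Omega \setminus \omega)$ is the unique solution to the problem

\[
\begin{cases}
-\Delta u_\omega = 0 & \text{in } \Omega \setminus \omega, \\
u_\omega = g & \text{on } \partial\Omega, \\
u_\omega = 0 & \text{on } \partial\omega.
\end{cases}
\tag{7.4}
\]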

Finally, in order to numerically solve the shape optimization problem (7.3), we will now compute the shape gradient of the cost functional $J$ and apply a classical gradient descent method.
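As a sanity check that is not taken from [B25], the functional $J$ can be written in closed form in a radially symmetric configuration. Assume, for this example only, that $\Omega$ is the disk of radius $R$ centered at the origin, that $\omega$ and $\omega_{\mathrm{ex}}$ are the concentric disks of radii $r$ and $r_{\mathrm{ex}}$, and that $g$ is a nonzero constant. The solution to Problem (7.4) is then radial and harmonic, hence logarithmic:

```latex
\[
u_\omega(x) = g \, \frac{\ln(|x|/r)}{\ln(R/r)} \quad \text{for } r \le |x| \le R,
\qquad
\partial_n u_\omega = \frac{g}{R \ln(R/r)} \quad \text{on } \partial\Omega,
\]
\[
J(\omega) = 2\pi R \left( \frac{g}{R \ln(R/r)} - \frac{g}{R \ln(R/r_{\mathrm{ex}})} \right)^{2}.
\]
```

In this one-parameter family, $J$ vanishes only at $r = r_{\mathrm{ex}}$, since $r \mapsto 1/\ln(R/r)$ is strictly increasing on $(0,R)$.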

7.3.2 Computation of the shape gradient

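In order to define shape derivatives, we will use the so-called Hadamard method. We first introduce the space of admissible deformations given by

\[
\mathcal{U} := \{ V \in W^{2,\infty}(\mathbb{R}^2,\mathbb{R}^2) \mid \operatorname{supp}(V) \subset \Omega_\varepsilon \},
\]

where $\Omega_\varepsilon$ is an open set with a $C^\infty$-boundary such that

\[
\{ x \in \Omega \mid d(x,\partial\Omega) > \varepsilon/2 \} \subset \Omega_\varepsilon \subset \{ x \in \Omega \mid d(x,\partial\Omega) > \varepsilon/3 \}.
\]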

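In particular we are interested in the shape gradient of $J$ defined by

\[
\nabla J(\omega) \cdot V := \lim_{t \to 0} \frac{J\big( (\mathrm{Id} + tV)(\omega) \big) - J(\omega)}{t},
\]

for every $\omega \in \mathcal{O}$ and every $V \in \mathcal{U}$.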

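Proposition 7.6. The least squares functional $J$ is differentiable at any $\omega \in \mathcal{O}$ in any direction $V \in \mathcal{U}$ with

\[
\nabla J(\omega) \cdot V = - \int_{\partial\omega} \partial_n u_\omega \, \partial_n w_\omega \, (V \cdot n),
\tag{7.5}
\]

where $w_\omega \in H^1(\Omega \setminus \omega)$ is the unique solution to the adjoint problem given by

\[
\begin{cases}
-\Delta w_\omega = 0 & \text{in } \Omega \setminus \omega, \\
w_\omega = 2(\partial_n u_\omega - \mu_b) & \text{on } \partial\Omega, \\
w_\omega = 0 & \text{on } \partial\omega.
\end{cases}
\tag{7.6}
\]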

From the above explicit formulation of the shape gradient of $J$, we are now in a position to implement some numerical simulations based on a classical gradient descent method, and we include the flip procedure in order to detect a multiple-component obstacle.
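Formula (7.5) can be checked numerically in a radially symmetric toy configuration. The sketch below is not taken from [B25] and relies on several assumptions: $\Omega$ is the disk of radius `R`, $\omega$ and the exact obstacle are concentric disks of radii `r` and `R_EX`, $g$ is the constant `G`, the deformation $V$ is the outward radial unit field near $\partial\omega$, and $n$ on $\partial\omega$ is taken as the unit normal pointing out of $\Omega \setminus \omega$ (that is, towards the center). In this setting $u_\omega$ and $w_\omega$ are explicit logarithms.

```python
import math

R, G, R_EX = 10.0, 100.0, 6.0   # illustrative values (radius, Dirichlet datum, exact radius)

def flux(r):
    """Normal derivative of u_omega on the outer circle (constant by symmetry)."""
    return G / (R * math.log(R / r))

def cost(r):
    """J for the disk of radius r: integral over the outer circle of (flux - mu_b)^2."""
    return 2.0 * math.pi * R * (flux(r) - flux(R_EX)) ** 2

def shape_gradient(r):
    """Right-hand side of Formula (7.5) for V = radial unit field."""
    log_ratio = math.log(R / r)
    c = 2.0 * (flux(r) - flux(R_EX))      # Dirichlet datum of the adjoint state
    dn_u = -G / (r * log_ratio)           # n = -e_rho on the inner circle
    dn_w = -c / (r * log_ratio)
    v_dot_n = -1.0                        # V = e_rho, n = -e_rho
    return -(2.0 * math.pi * r) * dn_u * dn_w * v_dot_n

def finite_difference(r, h=1e-6):
    """Central difference of r -> J(disk of radius r)."""
    return (cost(r + h) - cost(r - h)) / (2.0 * h)
```

Comparing `shape_gradient(r)` with `finite_difference(r)` for a few radii gives a quick consistency check of the sign conventions; with the opposite orientation of $n$ on $\partial\omega$ the two values would differ by a sign.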

Remark 7.7. The proof of Proposition 7.6 can be found in [B25, Appendix A]. It is largely inspired by the techniques developed in [240, Chapter 5] and is built in several stages:

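(i) Let $\omega \in \mathcal{O}$. We introduce

\[
\Theta := \Big\{ \theta \in \mathcal{U} \;\Big|\; \|\theta\|_{W^{2,\infty}} < \frac{\varepsilon}{3} \Big\},
\]

and, for any $\theta \in \Theta$, we consider the unique solution $u_\theta \in H^1(\Omega \setminus \omega_\theta)$ to the perturbed problem

\[
\begin{cases}
-\Delta u_\theta = 0 & \text{in } \Omega \setminus \omega_\theta, \\
u_\theta = g & \text{on } \partial\Omega, \\
u_\theta = 0 & \text{on } \partial\omega_\theta,
\end{cases}
\]

where $\omega_\theta := (\mathrm{Id} + \theta)(\omega)$. Then we introduce $v_\theta := u_\theta \circ (\mathrm{Id} + \theta)$, which is defined on the fixed domain $\Omega \setminus \omega$. Using the regularity assumptions (such as $C^{1,1}$-boundaries, $g \in H^{3/2}(\partial\Omega)$ and $\theta \in W^{2,\infty}(\mathbb{R}^2,\mathbb{R}^2)$) and the implicit function theorem, one can prove that the map $\theta \in \Theta \mapsto v_\theta \in H^2(\Omega \setminus \omega)$ is of class $C^1$ in a neighborhood of $0_{W^{2,\infty}}$. Finally, since $u_\theta = v_\theta \circ (\mathrm{Id} + \theta)^{-1}$, one can deduce that the map $\theta \in \Theta \mapsto u_\theta \in H^1(K)$ is differentiable at $0_{W^{2,\infty}}$ for every compact subset $K \subset \Omega \setminus \omega$.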

(ii) Now that the above differentiability results are available, our aim is to derive explicit expressions of the differentials. To this aim let $V \in \mathcal{U}$ and consider $\theta = tV$ with $t \ge 0$ sufficiently small. We introduce the notations $\omega_t := \omega_{tV}$, $u_t := u_{tV}$, $v_t := v_{tV}$ and $j(t) := J(\omega_t)$. Consider the shape derivative defined by $u'_0 := \lim_{t \to 0} \frac{u_t - u_0}{t}$ (where the quotient is considered on all compact subsets included in $\Omega$). From the variational formulation satisfied by $u_t$ (on the perturbed domain $\Omega \setminus \omega_t$), one can easily obtain that $-\Delta u'_0 = 0$ in $\Omega \setminus \omega$ and $u'_0 = 0$ on $\partial\Omega$. In the same manner, considering the material derivative defined by $\dot{u}_0 := \lim_{t \to 0} \frac{v_t - v_0}{t}$, we see that $\dot{u}_0 = 0$ on $\partial\omega$. Since moreover the equality $u'_0 = \dot{u}_0 - \nabla u_0 \cdot V$ is satisfied, we obtain that $u'_0 \in H^1(\Omega \setminus \omega)$ is the unique solution to

\[
\begin{cases}
-\Delta u'_0 = 0 & \text{in } \Omega \setminus \omega, \\
u'_0 = 0 & \text{on } \partial\Omega, \\
u'_0 = -\partial_n u_0 \, (V \cdot n) & \text{on } \partial\omega.
\end{cases}
\]

Note that we have used the tangential/normal decomposition of $\nabla u_0$ on $\partial\omega$ and the fact that its tangential part vanishes (since $u_0 = 0$ on $\partial\omega$).



(iii) Finally we obtain by differentiating under the integral sign that $j$ is differentiable at $t = 0$ with

\[
j'(0) = 2 \int_{\partial\Omega} \partial_n u'_0 \, (\partial_n u_0 - \mu_b).
\]

Note that the above expression of $j'(0)$ depends implicitly on $V$ (through the definition of $u'_0$), which entails a significant numerical cost in view of implementing a gradient descent method. In order to avoid this pitfall, we introduce the adjoint state $w \in H^1(\Omega \setminus \omega)$ which is the unique solution to

\[
\begin{cases}
-\Delta w = 0 & \text{in } \Omega \setminus \omega, \\
w = 2(\partial_n u_0 - \mu_b) & \text{on } \partial\Omega, \\
w = 0 & \text{on } \partial\omega.
\end{cases}
\]

One can easily prove from the variational formulations of $u'_0$ and $w$ (using $w$ as test function in the variational formulation of $u'_0$, and conversely) that

\[
j'(0) = - \int_{\partial\omega} \partial_n u_0 \, \partial_n w \, (V \cdot n),
\]

which concludes the proof of Proposition 7.6. Note that this result is in accordance with the structure theorem [240, Proposition 5.9.1], which asserts that the shape gradient $\nabla J(\omega) \cdot V$ depends only on the normal part $V \cdot n$ of the direction $V$.

7.3.3 Numerical simulations

The numerical simulations presented hereafter are performed using the finite element library FreeFem++ [239]. Note that all figures have been gathered at the end of the chapter.

In what follows the exterior boundary $\partial\Omega$ is assumed to be the circle centered at the origin and of radius 10, and we consider the exterior Dirichlet boundary condition $g = 100$. In order to get a suitable measurement $\mu_b$, we use synthetic data, that is, we fix a shape $\omega_{\mathrm{ex}}$, solve Problem (7.1) using a finite element method (here a P2 finite element discretization) and extract the measurement $\mu_b$ by computing $\partial_n u_{\mathrm{ex}}$ on $\partial\Omega$. Then we use a P1 finite element discretization to solve Problems (7.4) and (7.6) with 50 discretization points for both the exterior boundary and each cubic Bézier patch describing the shape $\omega$. In order to numerically solve the optimization problem (7.3), we use the following classical gradient descent algorithm and we include the flip procedure at Step 2.

Algorithm (A )

1. Fix k = 0, fix an initial shape ω0 and fix a maximal number M ∈N∗ of iterations;

2. Scan ωk looking for intersecting control polygons:

(a) in the case of no intersecting control polygons, go to Step 3;

(b) in the case of intersecting control polygons, apply the flip procedure and obtain a multiple-component shape $\omega_k \leftarrow \omega_k^1 \cup \omega_k^2$;

3. Solve Problems (7.4) and (7.6) with ω=ωk ;

4. Compute the shape gradient ∇J (ωk ) from Formula (7.5);

5. Move the control points of the shape, that is, do $\omega_{k+1} \leftarrow \omega_k - \alpha_k \nabla J(\omega_k)$, where $\alpha_k$ is a small positive coefficient chosen, e.g., by a classical line search;

6. Do k ← k +1 and get back to Step 2 while k < M .
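The descent loop of Algorithm (A) (Steps 3 to 6, without the flip) can be illustrated on a one-parameter toy model. This sketch is not the FreeFem++ code used for the simulations: the "shape" is reduced to the radius `r` of a disk concentric with the exterior circle, the state and adjoint problems are replaced by their closed-form radial solutions, and the Armijo backtracking rule stands in for the unspecified "classical line search". All names and constants other than the radius 10, the datum $g = 100$ and the exact radius 6 are assumptions.

```python
import math

R, G, R_EX = 10.0, 100.0, 6.0   # exterior radius, Dirichlet datum, exact obstacle radius

def flux(r):
    """Normal derivative of the state on the exterior circle (radial case)."""
    return G / (R * math.log(R / r))

def cost(r):
    return 2.0 * math.pi * R * (flux(r) - flux(R_EX)) ** 2

def gradient(r):
    """Derivative of the cost with respect to the radius r."""
    log_ratio = math.log(R / r)
    return 4.0 * math.pi * G * (flux(r) - flux(R_EX)) / (r * log_ratio ** 2)

def descend(r0, max_iter=200, tol=1e-8, armijo=1e-4):
    """Gradient descent r <- r - alpha * J'(r) with backtracking on alpha."""
    r, history = r0, [cost(r0)]
    for _ in range(max_iter):
        g = gradient(r)
        if abs(g) < tol:
            break
        alpha = 1.0
        for _ in range(60):
            r_new = r - alpha * g
            # keep the disk admissible and require a sufficient decrease
            if 0.0 < r_new < R and cost(r_new) <= cost(r) - armijo * alpha * g * g:
                break
            alpha *= 0.5
        else:
            break                      # line search failed, stop iterating
        r = r_new
        history.append(cost(r))
    return r, history

radius, costs = descend(2.0)
```

In the actual algorithm the unknowns are the control points of the Bézier patches and each cost evaluation requires a finite element solve, so the number of trial steps in the line search matters much more than in this toy model.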

We tested Algorithm (A ) in the following contexts:



(i) on the problem of detecting smooth convex objects. Precisely we begin by detecting the circle centered at the origin and of radius 6, and then the ellipse $(8\cos\theta, 5\sin\theta)$, $\theta \in [0,2\pi]$, using four cubic Bézier patches. Numerical simulations are depicted in Figure 7.7;

(ii) on the problem of detecting a nonsmooth shape and of detecting a non-convex shape (see Figure 7.8). Precisely we first consider the square of side 10 centered at the origin and we use four cubic Bézier patches. As one can see in Figure 7.8(a), each Bézier patch detects a side of the square. In Figure 7.8(b), we consider the non-convex shape parameterized by $(2.8(1.6+\cos(3\theta))\cos\theta, \; 2.8(1.6+\cos(3\theta))\sin\theta)$, $\theta \in [0,2\pi]$, using six cubic Bézier patches. This shape is also considered in [223, Figure 4], where the authors obtained the convex hull of the shape. However, note that the authors used a different method, in which the descent direction is obtained by solving a boundary value problem involving the kernel of the shape gradient;

(iii) on the problem of detecting a two-component shape starting from a one-component initial shape. We consider two circles of radius 2 centered at $(-4,-4)$ and $(4,4)$. We present different states of the algorithm in Figure 7.9. The initial Bézier shape consists of a single component with four cubic Bézier patches, located at the center (Figure 7.9(a)). The shape grows and surrounds the two objects until two control polygons intersect each other (Figure 7.9(b)). The flip procedure is performed and the shape is divided into two connected components (Figure 7.9(c)). At the end, the algorithm provides an approximation of the two obstacles (Figure 7.9(d)). Figure 7.10 depicts the evolution of the objective function during this simulation. One can note a change of behavior after Iteration 133, which corresponds to the step at which the flip procedure is performed. Precisely, the algorithm first finds a local minimizer at Iteration 13, which corresponds to a one-component minimizer. After oscillations around this local minimum, the flip procedure is performed and the functional decreases and stabilizes around a two-component minimizer.

7.4 Concluding comments and perspectives

In the paper [B25] we have introduced the flip procedure as a method that divides a one-component shape into a two-component shape. Actually the flip procedure can be easily adapted to perform the reverse operation, that is, to merge a two-component shape into a one-component shape. I refer to [B25, Appendix B] for details and numerical simulations in that context.

I conclude this chapter with a brief discussion on the challenging extension to the three-dimensional case. I refer for instance to [266], where the deformation of piecewise Bézier surfaces is presented with an implementation. Note that the adaptation of the complete algorithmic setting of the flip procedure to the three-dimensional case would be nontrivial, since it would increase the algorithmic and combinatorial complexities. Numerous considerations about this generalization could be addressed in the future.

Figure 7.7 – Detection of convex and smooth obstacles. Panels: (a) Detection of a circle; (b) Detection of an ellipse. Each panel shows the exterior boundary, the exact shape, the initial shape and the approximated shape.

Figure 7.8 – Detection of a nonsmooth obstacle and of a non-convex obstacle. Panels: (a) Detection of a square; (b) Detection of a non-convex shape. Each panel shows the exterior boundary, the exact shape, the initial shape and the approximated shape.

Figure 7.9 – Detection of a two-component obstacle starting from a one-component shape. Panels: (a) Initial shape; (b) Intersecting control polygons; (c) Flip procedure; (d) Final shape. Each panel shows the exterior boundary, the exact shape and the approximated shape.

Figure 7.10 – Evolution of the objective function for the detection of a two-component shape. The axis ticks run from 0 to 200 (horizontal) and from 0 to 14000 (vertical); the flip is marked on the curve.

Chapter 8

A prelude to Chapters 9, 10 and 11

The present chapter is a prelude to Chapters 9, 10 and 11. Here my aim is to recount the genesis of the works summarized in these three chapters, by emphasizing their common motivation. Indeed these three contributions (which may seem disconnected at first sight) were all initiated in view of a larger, common research project dealing with shape optimization problems in contact mechanics.

Introduction. Variational inequalities are particularly involved in the modeling of mechanical contact problems, for instance to take into account Signorini unilateral boundary conditions or the well-known Tresca friction law. A brief introduction to the relationships between contact mechanics and variational inequalities is given in the Introduction of Chapter 10. Otherwise I refer the interested reader to the book [231] by Duvaut and Lions for a complete study. Numerous shape optimization problems naturally arise in contact mechanics, as evidenced by works such as [228, 238, 244, 247, 254, 270, 290, 293] and references therein. However, in that context, the models involve inequality and/or nonsmooth boundary conditions. Hence, the classical tools of shape optimization theory, for instance those used to compute shape gradients, cannot be easily extended to these more intricate models. As a consequence, most of the references listed above investigate regularized (or penalized) mechanical contact problems.

After the work [B25], Samir Adly from the University of Limoges, who is an expert in variational analysis, and particularly in variational inequalities, brought to the attention of Fabien Caubet and myself that it could be possible to investigate shape optimization problems in contact mechanics, precisely involving the Tresca friction law, without using any regularization method (in order to keep the intrinsic nonsmooth nature of the model), but rather by using directly the tools of the variational analysis community. This research project, born in 2015 and still in progress in 2020, has been the breeding ground of the four following papers:

• [B26] S. Adly and L. Bourdin. Sensitivity analysis of variational inequalities via twice epi-differentiability and proto-differentiability of the proximity operator. SIAM J. Optim., 28(2):1699–1725, 2018.

• [B27] S. Adly, L. Bourdin, and F. Caubet. On a decomposition formula for the proximal operator of the sum of two convex functions. J. Convex Anal., 26(3):699–718, 2019.

• [B28] S. Adly and L. Bourdin. On a decomposition formula for the resolvent operator of the sum of two set-valued maps with monotonicity assumptions. Appl. Math. Optim., 80(3):715–732, 2019.

• [B29] S. Adly, L. Bourdin, and F. Caubet. The derivative of a parameterized mechanical contact problem with a Tresca's friction law involves Signorini unilateral conditions. Submitted, 2020.

Indeed our first investigations led us to divide our global research project into several smaller mathematical challenges. The aim of the present chapter is to expound in a simple way the above division of our work. Then Chapters 9, 10 and 11 will summarize the contributions of each of the papers listed above.


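A scalar version of the Tresca friction model. In what follows we consider the scalar problem given by: find $u \in H^1(\Omega)$ such that

\[
\begin{cases}
-\Delta u + u = f & \text{in } \Omega, \\
u = 0 & \text{on } \Gamma_D, \\
|\partial_n u| \le g \ \text{ and } \ |u| \, \partial_n u + u g = 0 & \text{on } \Gamma_T,
\end{cases}
\tag{8.1}
\]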

where $\Omega \subset \mathbb{R}^d$ is an open bounded and connected domain with a Lipschitz boundary $\Gamma := \partial\Omega$ decomposed into two measurable disjoint parts $\Gamma := \Gamma_D \cup \Gamma_T$ and where $f \in L^2(\Omega)$ is the source term. We consider a homogeneous Dirichlet boundary condition on the boundary part $\Gamma_D$ and a Tresca-type friction law on the boundary part $\Gamma_T$, where $g \in L^2(\Gamma_T)$ stands for the nonnegative friction threshold. Note that Problem (8.1) corresponds to the scalar version of the usual vectorial Tresca friction model considered in the literature (see, e.g., [225, Section 2.1] for a handy presentation of the vectorial Tresca friction model).

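The variational formulation of Problem (8.1) leads to the variational inequality given by: find $u \in K$ such that

\[
\int_\Omega u(\varphi - u) + \int_\Omega \nabla u \cdot \nabla(\varphi - u) + \int_{\Gamma_T} g \, (|\varphi| - |u|) \ge \int_\Omega f(\varphi - u),
\tag{8.2}
\]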

for all test functions $\varphi \in K$, where $K \subset H^1(\Omega)$ is the nonempty closed subspace given by

\[
K := \{ \varphi \in H^1(\Omega) \mid \varphi = 0 \text{ on } \Gamma_D \}.
\]

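In that context, using the tools of the convex analysis community, it can be proved that Problem (8.2) has a unique solution given by

\[
u = \mathrm{prox}_{\iota_K + \Phi}(F),
\tag{8.3}
\]

where $\iota_K : H^1(\Omega) \to \mathbb{R} \cup \{+\infty\}$ is the indicator function of $K$, defined by $\iota_K(\varphi) := 0$ if $\varphi \in K$ and $\iota_K(\varphi) := +\infty$ otherwise, where $\Phi : H^1(\Omega) \to \mathbb{R}$ is the Tresca friction functional defined by

\[
\Phi : H^1(\Omega) \longrightarrow \mathbb{R}, \qquad \varphi \longmapsto \int_{\Gamma_T} g \, |\varphi|,
\]

where $\mathrm{prox}_{\iota_K + \Phi} : H^1(\Omega) \to H^1(\Omega)$ stands for the proximal operator (also known as proximity operator, see the paragraph below for more details) of the sum $\iota_K + \Phi$, and finally where $F \in H^1(\Omega)$ is the unique solution to the variational formulation of the basic Neumann problem

\[
\begin{cases}
-\Delta F + F = f & \text{in } \Omega, \\
\partial_n F = 0 & \text{on } \Gamma.
\end{cases}
\]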
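The passage from the variational inequality (8.2) to Formula (8.3) can be sketched in three lines. The sketch assumes that $H^1(\Omega)$ is endowed with its usual scalar product $\langle u, \varphi \rangle_{H^1(\Omega)} = \int_\Omega u \varphi + \int_\Omega \nabla u \cdot \nabla \varphi$ and uses the characterization $\mathrm{prox}_h = (\mathrm{Id} + \partial h)^{-1}$ of the proximal operator:

```latex
\[
\begin{aligned}
& \langle F, \varphi \rangle_{H^1(\Omega)} = \int_\Omega f \varphi
  \quad \text{for all } \varphi \in H^1(\Omega)
  && \text{(weak form of the Neumann problem)}, \\
& (8.2) \iff u \in K \ \text{and} \
  \langle F - u, \varphi - u \rangle_{H^1(\Omega)} \le \Phi(\varphi) - \Phi(u)
  \quad \text{for all } \varphi \in K, \\
& \phantom{(8.2)} \iff F - u \in \partial(\iota_K + \Phi)(u)
  \iff u = \big( \mathrm{Id} + \partial(\iota_K + \Phi) \big)^{-1}(F)
  = \mathrm{prox}_{\iota_K + \Phi}(F).
\end{aligned}
\]
```

The second equivalence uses that $(\iota_K + \Phi)(\varphi) = +\infty$ for $\varphi \notin K$, so that the subgradient inequality is automatically satisfied outside $K$.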

A first mathematical challenge. In the convex analysis literature, when considering a general real Hilbert space $H$, the set of proper closed convex functions $h : H \to \mathbb{R} \cup \{+\infty\}$ is usually denoted by $\Gamma_0(H)$. The proximal operator of a given function $h \in \Gamma_0(H)$, introduced by Moreau [258] in 1965, is defined by $\mathrm{prox}_h := (\mathrm{Id} + \partial h)^{-1}$, where $\partial h : H \rightrightarrows H$ stands for the classical subdifferential operator of $h$. I refer the reader to Chapter 9 or to standard books on convex analysis such as [213, 242] for more details.
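As a minimal scalar illustration of this definition (an example chosen here, not taken from the cited works), take $H = \mathbb{R}$ and $h = \lambda |\cdot|$ with $\lambda > 0$. The proximal operator is then the classical soft-thresholding map, and the inclusion $x - p \in \partial h(p)$, which characterizes $p = (\mathrm{Id} + \partial h)^{-1}(x)$, can be checked directly. The function names are hypothetical.

```python
def prox_abs(x, lam=1.0):
    """Closed form of prox_h for h = lam * |.| on the real line (soft-thresholding)."""
    return max(abs(x) - lam, 0.0) * (1.0 if x > 0 else -1.0)

def satisfies_inclusion(x, p, lam=1.0, tol=1e-12):
    """Check x - p in the subdifferential of h at p, i.e. p = (Id + dh)^{-1}(x).

    The subdifferential of lam*|.| is {lam} for p > 0, {-lam} for p < 0
    and the interval [-lam, lam] at p = 0."""
    if p > 0:
        return abs((x - p) - lam) <= tol
    if p < 0:
        return abs((x - p) + lam) <= tol
    return abs(x) <= lam + tol

samples = [-3.0, -1.0, -0.5, 0.0, 0.2, 1.0, 2.5]
checks = [satisfies_inclusion(x, prox_abs(x)) for x in samples]
```

The unique minimizer of $h$, namely $0$, is also the unique fixed point of `prox_abs`.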

The proximal operator finds numerous applications in the field of convex optimization since one can easily see that the set of minimizers of a given function $h \in \Gamma_0(H)$ exactly coincides with the set of fixed points of $\mathrm{prox}_h$. A wide literature is then concerned with fixed-point algorithms of the form $x_{k+1} = \mathrm{prox}_h(x_k)$, usually called proximal algorithms, introduced by Martinet [253] in 1972 and deeply studied by Rockafellar [273] in 1976. When dealing with a sum $h = h_1 + h_2$, where $h_1, h_2 \in \Gamma_0(H)$, it turns out that $\mathrm{prox}_{h_1 + h_2} \neq \mathrm{prox}_{h_1} \circ \mathrm{prox}_{h_2}$ in general (see, e.g., [294, Example 2] for a counterexample). Nonetheless a comprehensive literature is dedicated to alternative algorithms that find the fixed points of $\mathrm{prox}_{h_1 + h_2}$ with the only knowledge of $\mathrm{prox}_{h_1}$ and $\mathrm{prox}_{h_2}$. I refer for example to the well-known Douglas–Rachford algorithm, introduced by Douglas and Rachford [230] in 1956 in the particular case of a linear parabolic heat equation and extended by Lions and Mercier [251] in 1979 to a general setting.
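Both statements can be observed on the real line with a pair of functions chosen here for illustration (it is not the counterexample of [294]): $h_1 = |\cdot|$ and $h_2 = \frac{1}{2}(\cdot - 3)^2$. The closed form of $\mathrm{prox}_{h_1+h_2}$ below was derived by hand for this specific pair, and the Douglas–Rachford iteration is written in its standard form with unit step; the function names are hypothetical.

```python
def prox_h1(x):
    """prox of h1 = |.| (soft-thresholding with threshold 1)."""
    return max(abs(x) - 1.0, 0.0) * (1.0 if x > 0 else -1.0)

def prox_h2(x):
    """prox of h2 = (. - 3)^2 / 2, i.e. the midpoint between x and 3."""
    return (x + 3.0) / 2.0

def prox_sum(x):
    """prox of h1 + h2: solves 2p + d|p| contains x + 3 (hand computation)."""
    return prox_h1(x + 3.0) / 2.0

def douglas_rachford(y=0.0, n_iter=100):
    """Finds a minimizer of h1 + h2 using prox_h1 and prox_h2 only."""
    for _ in range(n_iter):
        x = prox_h2(y)
        y = y + prox_h1(2.0 * x - y) - x
    return prox_h2(y)

composed_at_1 = prox_h1(prox_h2(1.0))   # composition of the two proximal operators
exact_at_1 = prox_sum(1.0)              # proximal operator of the sum
minimizer = douglas_rachford()
```

Here `composed_at_1` and `exact_at_1` differ, while `minimizer` approaches the minimizer $2$ of $h_1 + h_2$, which is also a fixed point of `prox_sum`. Note that Douglas–Rachford yields this fixed point but not the value of `prox_sum` at an arbitrary point.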

However, in the previous paragraph, we obtained Formula (8.3), in which the issue is not to find fixed points of the proximal operator $\mathrm{prox}_{\iota_K + \Phi}$, but to compute $\mathrm{prox}_{\iota_K + \Phi}$ at a given point $F \in H^1(\Omega)$. My first collaboration [B27] with Adly and Caubet was concerned with a decomposition formula for the proximal operator of the sum of two functions, and with the derivation of an algorithm to compute it. Later, in my collaboration [B28] with Adly, we extended the results obtained in [B27] to the case of the resolvent operator (which can be seen as a generalization of the proximal operator) of a sum of two set-valued maps. Chapter 9 summarizes the contributions of the two works [B27, B28].

A second mathematical challenge. Let us consider again Problem (8.1) and its variational formulation (8.2). In what follows we assume for simplicity that $\Gamma_D = \emptyset$, and thus $\Gamma = \Gamma_T$ and $K = H^1(\Omega)$. In that situation Formula (8.3) becomes simply

\[
u = \mathrm{prox}_\Phi(F).
\]

As developed in Remark 7.7 of Chapter 7, when dealing with shape optimization problems, and in particular with the computation of shape gradients, one aims to compute the shape derivative of the state $u$. To this aim we are led to introduce the perturbed domain $\Omega_t := (\mathrm{Id} + tV)(\Omega)$ for small values of $t \ge 0$, where $V \in W^{1,\infty}(\mathbb{R}^d,\mathbb{R}^d)$, and to introduce $u_t \in H^1(\Omega_t)$, the unique solution to the perturbed problem

\[
\begin{cases}
-\Delta u_t + u_t = f & \text{in } \Omega_t, \\
|\partial_n u_t| \le g \ \text{ and } \ |u_t| \, \partial_n u_t + u_t g = 0 & \text{on } \Gamma_t,
\end{cases}
\]

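where $\Gamma_t := \partial\Omega_t$. In order to compute the shape derivative of $u$ in the direction $V$, we introduce the material derivative of $u$ in the direction $V$ defined by $\dot{u}_0 := \lim_{t \to 0} \frac{v_t - v_0}{t}$, where $v_t := u_t \circ (\mathrm{Id} + tV)$ is defined on the fixed domain $\Omega$. The variational formulation of the perturbed problem is given by: find $u_t \in H^1(\Omega_t)$ such that

\[
\int_{\Omega_t} u_t(\varphi_t - u_t) + \int_{\Omega_t} \nabla u_t \cdot \nabla(\varphi_t - u_t) + \int_{\Gamma_t} g \, (|\varphi_t| - |u_t|) \ge \int_{\Omega_t} f(\varphi_t - u_t),
\]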

for all test functions $\varphi_t \in H^1(\Omega_t)$. Using the change of variable $y = (\mathrm{Id} + tV)(x)$ in the above integrals, we are able to provide a variational formulation for the function $v_t$ with integrals defined on the fixed domain $\Omega$ and on the fixed boundary $\Gamma$ and with test functions $\varphi \in H^1(\Omega)$. Nevertheless the computations involve several perturbation terms emerging from the changes of variable (such as the determinants of the corresponding Jacobian matrices for example) and it is not my aim in this paragraph to provide all the details of this ongoing work. Roughly speaking, from this variational formulation for $v_t$, we are able to obtain that

\[
v_t = \mathrm{prox}_{\Phi_t}(F_t),
\tag{8.4}
\]

where $F_t$ is a perturbation term of $F$ defined on the fixed domain $\Omega$ and where

\[
\Phi_t : H^1(\Omega) \longrightarrow \mathbb{R}, \qquad \varphi \longmapsto \int_\Gamma g_t \, |\varphi|,
\]

where $g_t$ is a perturbation term of $g$ defined on the fixed boundary $\Gamma$. Finally the following natural question emerges: are we able, from Formula (8.4), to provide an expression of the derivative of $v_t$ at $t = 0$ in terms of the derivatives of $F_t$ and $g_t$ at $t = 0$?



Before addressing this precise question, Adly, Caubet and myself decided to investigate a simpler mathematical question by removing the $t$-dependence of the functional $\Phi_t$. Precisely, in our paper [B29], we decided to investigate the sensitivity analysis of the parameterized Tresca friction problem (on the fixed domain $\Omega$) given by: find $u_t \in H^1(\Omega)$ such that

\[
\begin{cases}
-\Delta u_t + u_t = f_t & \text{in } \Omega, \\
|\partial_n u_t| \le 1 \ \text{ and } \ |u_t| \, \partial_n u_t + u_t = 0 & \text{on } \Gamma,
\end{cases}
\tag{8.5}
\]

where $f_t$ is a parameterized source term and we took $g \equiv 1$ for simplicity. In that context, the unique solution to the variational formulation of Problem (8.5) is given by

\[
u_t = \mathrm{prox}_\Phi(F_t),
\tag{8.6}
\]

where $F_t \in H^1(\Omega)$ is the unique solution to the variational formulation of the parameterized Neumann problem given by

\[
\begin{cases}
-\Delta F_t + F_t = f_t & \text{in } \Omega, \\
\partial_n F_t = 0 & \text{on } \Gamma.
\end{cases}
\tag{8.7}
\]

Investigating the twice epi-differentiability (a notion of generalized second-order derivative introduced and thoroughly studied by Rockafellar in [274, 276, 278]) of the Tresca friction functional $\Phi$ (with $g \equiv 1$), we were able in [B29] to prove that the derivative of $u_t$ at $t = 0$ is the unique solution to the variational formulation of a Signorini-type problem involving the derivative of $f_t$ at $t = 0$. For all details, I refer to Chapter 10, which is dedicated to the contributions of the collaboration [B29].
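The appearance of a unilateral (Signorini-type) condition in the derivative can already be seen on a scalar analogue, which is only a heuristic illustration and not the result of [B29]. On $H = \mathbb{R}$ with $\Phi = |\cdot|$, the map $\mathrm{prox}_\Phi$ is the soft-thresholding with threshold 1. At a point $F_0$ with $|F_0| = 1$, its one-sided directional derivative is not linear in the direction: it is the projection of the direction onto a half-line. The function names and the finite-difference step are assumptions.

```python
def soft(x):
    """prox of |.| on the real line."""
    return max(abs(x) - 1.0, 0.0) * (1.0 if x > 0 else -1.0)

def one_sided_derivative(f0, df, t=1e-7):
    """Finite-difference approximation of d/dt soft(f0 + t*df) at t = 0+."""
    return (soft(f0 + t * df) - soft(f0)) / t

def predicted_derivative(f0, df):
    """Expected derivative of t -> soft(f0 + t*df) at t = 0+."""
    if abs(f0) > 1.0:
        return df                 # smooth regime: the derivative is linear in df
    if abs(f0) < 1.0:
        return 0.0                # the solution is locked at zero
    s = 1.0 if f0 > 0 else -1.0
    return s * max(s * df, 0.0)   # kink: projection of df onto the half-line s*[0, +inf)

cases = [(1.0, 2.0), (1.0, -2.0), (-1.0, -3.0), (-1.0, 3.0), (2.0, 5.0), (0.3, 4.0)]
errors = [abs(one_sided_derivative(f0, df) - predicted_derivative(f0, df))
          for f0, df in cases]
```

In the third regime the derivative solves a one-dimensional variational inequality on a half-line, which is the scalar counterpart of a unilateral constraint.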

A third mathematical challenge. As we have proved in [B29], the notion of twice epi-differentiability is well suited to investigate the differentiability of $u_t$ at $t = 0$ from Formula (8.6). Unfortunately this notion has not been extended in the literature to the case of a $t$-dependent functional $\Phi_t$, while this situation occurs in Formula (8.4). Thus, in the paper [B26] in collaboration with Adly, we have extended the notion of twice epi-differentiability to the case of a $t$-dependent functional $\Phi_t$. This extension is not trivial due to specific difficulties. For all details, I refer to Chapter 11, which is dedicated to the contributions of the paper [B26] written jointly with Adly.

I take this opportunity to mention that these contributions prove to be suitable for investigating the sensitivity analysis of the parameterized Tresca friction problem (on the fixed domain $\Omega$) given by: find $u_t \in H^1(\Omega)$ such that

\[
\begin{cases}
-\Delta u_t + u_t = f_t & \text{in } \Omega, \\
|\partial_n u_t| \le g_t \ \text{ and } \ |u_t| \, \partial_n u_t + u_t g_t = 0 & \text{on } \Gamma,
\end{cases}
\tag{8.8}
\]

where $f_t$ is a parameterized source term and $g_t$ is a parameterized friction threshold. In that context, the unique solution to the variational formulation of Problem (8.8) is given by

\[
u_t = \mathrm{prox}_{\Phi_t}(F_t),
\]

where $F_t \in H^1(\Omega)$ is the unique solution to the variational formulation of Problem (8.7) and where the parameterized functional $\Phi_t$ is defined by

\[
\Phi_t : H^1(\Omega) \longrightarrow \mathbb{R}, \qquad \varphi \longmapsto \int_\Gamma g_t \, |\varphi|.
\]

I refer to Section 11.4.2 in Chapter 11 for details.

A research project still in progress. The research project presented in this chapter was born in 2015 and has been the breeding ground of several collaborations between Adly, Caubet and myself. The three of us are still making progress on this long-term work. In particular, note that a PhD thesis at the University of Pau, supervised by Caubet and myself, will start in September 2020 in order to pursue our investigations on that theme.


Chapter 9

On a decomposition formula for the resolvent operator of the sum of two set-valued maps with monotonicity assumptions


9.2 Main results in [B28]

9.2.1 Notation and basics

9.2.2 The A-resolvent operator of B and a decomposition formula

9.2.3 A weakly convergent algorithm that computes $J^A_B$

9.3 Basic application in elliptic PDEs

9.4 Some comments on the earlier work [B27]


• [B27] S. Adly, L. Bourdin, and F. Caubet. On a decomposition formula for the proximal operator of the sum of two convex functions. J. Convex Anal., 26(3):699–718, 2019.

• [B28] S. Adly and L. Bourdin. On a decomposition formula for the resolvent operator of the sum of two set-valued maps with monotonicity assumptions. Appl. Math. Optim., 80(3):715–732, 2019.


9.1 Introduction

Let $H$ be a real Hilbert space. Finding the zeros of a set-valued map $T : H \rightrightarrows H$ is a topic of major importance in applied mathematics, particularly in optimization theory and in nonlinear functional analysis. Solving the corresponding inclusion $0_H \in T(x)$ is clearly equivalent to finding the fixed points of the associated resolvent operator $J_T := (\mathrm{I} + T)^{-1}$. Therefore the fixed-point algorithm $x_{k+1} \in J_T(x_k)$ is naturally considered in order to numerically approach a zero of $T$. The class of maximal monotone operators is known to be convenient for such algorithms. Indeed, if $T$ is maximal monotone, then the resolvent operator $J_T$ enjoys several nice mathematical properties such as single-valuedness and firm nonexpansiveness. In that case the above sequence $(x_k)_{k \in \mathbb{N}}$ is known to be weakly convergent to a zero of $T$. This is the spirit of the proximal algorithm introduced in 1972 by Martinet [253] and thoroughly studied in 1976 by Rockafellar [273]. Recall that the term proximal (or proximity) was coined in 1965 by Moreau [258], who introduced the so-called proximal operator $\mathrm{prox}_f$, associated with a proper closed convex function $f : H \to \mathbb{R} \cup \{+\infty\}$, which corresponds to the resolvent operator of the subdifferential operator $\partial f : H \rightrightarrows H$ of $f$ (which is known to be maximal monotone, see [272]), that is, $\mathrm{prox}_f = J_{\partial f}$. I refer to Section 9.2.1 for more basics and notations.
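The fixed-point iteration $x_{k+1} = J_T(x_k)$ can be illustrated in $H = \mathbb{R}^2$ on an operator chosen here for the example (it does not come from [B28]): $T(x) = Ax - b$ with the skew matrix $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. This $T$ is monotone (since $\langle Ax, x \rangle = 0$) and continuous, hence maximal monotone, but it is not a subdifferential. Its resolvent is obtained by solving the $2 \times 2$ linear system $(\mathrm{I} + A)x = y + b$ by hand.

```python
B = (1.0, 2.0)   # hypothetical right-hand side; the zero of T is x* = (-2, 1)

def resolvent(y, b=B):
    """J_T(y) for T(x) = A x - b with A = [[0, 1], [-1, 0]].

    Uses (I + A)^{-1} = 0.5 * [[1, -1], [1, 1]]."""
    r0, r1 = y[0] + b[0], y[1] + b[1]
    return (0.5 * (r0 - r1), 0.5 * (r0 + r1))

def proximal_point(x0=(0.0, 0.0), n_iter=200):
    """Iterates x_{k+1} = J_T(x_k)."""
    x = x0
    for _ in range(n_iter):
        x = resolvent(x)
    return x

zero_of_T = proximal_point()
```

In this example the error is multiplied by $1/\sqrt{2}$ at each step (the spectral radius of $(\mathrm{I}+A)^{-1}$), whereas an explicit iteration $x_{k+1} = x_k - \tau T(x_k)$ with $\tau > 0$ would diverge, since the eigenvalues $1 \pm i\tau$ of $\mathrm{I} - \tau A$ have modulus larger than one.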

A large number of contributions have been devoted to efficient algorithms for solving inclusions of the form $0_H \in T(x)$. Indeed, since the resolvent operator $J_T$ is not easy to compute in general, the above proximal algorithm cannot be implemented in most cases. The splitting methods consist in decomposing the operator $T = A + B$ as the sum of two (or more) operators $A$ and $B$ for which the associated resolvent operators $J_A$ and $J_B$ can be computed. As an example, if $A$ and $B$ are maximal monotone, the Douglas–Rachford algorithm, which is one of the most popular splitting methods, computes a numerical approximation of a zero of $T = A + B$ with the only knowledge of $J_A$ and $J_B$ (and without the knowledge of $J_T = J_{A+B}$). It was first introduced in 1956 by Douglas and Rachford [230] in order to numerically solve a linear parabolic heat equation by using a finite difference method. In 1979, Lions and Mercier [251] extended this algorithm to the above general framework, with application to the classical obstacle problem in PDEs.

Although splitting methods have had great success in the literature, it has to be clarified that no closed expression of $J_{A+B}$ depending on $J_A$ and $J_B$ has been discovered yet. This question still remains an open challenge. In particular note that the Douglas–Rachford algorithm is not based on the computation of $J_T = J_{A+B}$ (and it does not compute it either). In my collaboration [B28] with Adly, our main objective was to provide an explicit decomposition formula of $J_{A+B}$. For this purpose we introduced a new operator, called the $A$-resolvent operator of $B$ and denoted by $J^A_B$, which generalizes the usual notion (see Definition 9.1 for details). Our major contribution in [B28] is the decomposition formula $J_{A+B} = J_A \circ J^A_B$ holding true when $A$ is monotone (see Theorem 9.2). The difficulty of computing $J_T = J_{A+B}$ is thus transferred to the computation of the new operator $J^A_B$. For this purpose, a relationship between $J^A_B$ and an extended version of the Douglas–Rachford operator has been provided in [B28, Section 4]. This allowed us to propose a weakly convergent algorithm that computes numerically $J^A_B$, and thus $J_{A+B}$ from the decomposition formula, when $A$ and $B$ are maximal monotone (see Theorem 9.13). Finally, following the spirit of the founding paper [251] by Lions and Mercier, Section 9.3 is devoted to an application of our theoretical results in elliptic PDEs. Precisely, the decomposition formula is used in order to point out the relationship between the classical obstacle problem and a new nonlinear elliptic PDE where the diffusion only occurs on the nonnegative part of the unknown function (the elliptic operator is said to be partially blinded). Some numerical experiments, using the finite element library FREEFEM++ [239], are carried out for illustration.

To conclude this introduction, I mention that the work [B28] is a continuation of the previous paper [B27] in collaboration with Adly and Caubet. Indeed a decomposition formula was earlier provided in [B27] for the proximal operator $\mathrm{prox}_{f+g}$ of the sum of two proper closed convex functions $f, g : H \to \mathbb{R} \cup \{+\infty\}$. Precisely, under the qualification condition $\partial(f+g) = \partial f + \partial g$, we proved in [B27, Theorem 2.8] that the decomposition formula $\mathrm{prox}_{f+g} = \mathrm{prox}_f \circ \mathrm{prox}^f_g$ holds true, which exactly corresponds to the equality $J_{A+B} = J_A \circ J^A_B$ in the particular case where $A = \partial f$ and $B = \partial g$. Hence the results obtained in the second paper [B28] are more general since the decomposition formula handles general set-valued maps $A$ and $B$, which are not necessarily subdifferential maps of proper closed convex functions. In addition this formula requires only the monotonicity of $A$ (no qualification condition is used). Nevertheless I devote the last Section 9.4 of the present chapter to specific comments on the contributions of the first work [B27].

9.2 Main results in [B28]

This section is dedicated to the main contributions of the paper [B28] written jointly with Adly. In Section 9.2.1, we start with some notations used throughout the chapter and we recall some basics from the theory of set-valued operators. In Section 9.2.2, the definition of the $A$-resolvent operator of $B$, denoted by $J^A_B$, is recalled. Then we state the decomposition formula in Theorem 9.2, followed by a list of general comments. Finally, in Section 9.2.3, the relationship between $J^A_B$ and an extended version of the Douglas–Rachford operator is provided. This result allows us to propose a weakly convergent algorithm that computes numerically $J^A_B$, and thus $J_{A+B}$ from the decomposition formula, when $A$ and $B$ are maximal monotone.

9.2.1 Notation and basics

For this section, I refer the reader to standard references and monographs like [212, 216, 219, 257, 267, 271] and references therein. Let $H$ be a real Hilbert space and let $\langle \cdot, \cdot \rangle$ be the corresponding scalar product. In this chapter $I : H \to H$ denotes the standard identity map and, for any pair $(S_1, S_2)$ of subsets of $H$, we denote by

\[ S_1 + S_2 := \{ x_1 + x_2 \mid x_1 \in S_1, \ x_2 \in S_2 \}. \]

Sum and composition of set-valued maps. The domain and the graph of a set-valued map $A : H \rightrightarrows H$ are respectively given by

\[ D(A) := \{ x \in H \mid A(x) \neq \emptyset \} \quad \text{and} \quad \mathrm{Gr}(A) := \{ (x,y) \in H \times H \mid y \in A(x) \}. \]

We denote by $A^{-1} : H \rightrightarrows H$ the set-valued operator defined by

\[ A^{-1}(y) := \{ x \in H \mid y \in A(x) \}, \]

for all $y \in H$. For all $x, y \in H$, note that $y \in A(x)$ if and only if $x \in A^{-1}(y)$. The range of $A$ is given by

\[ R(A) := \{ y \in H \mid A^{-1}(y) \neq \emptyset \} = D(A^{-1}). \]

The sets of fixed points and zeros of $A$ are respectively given by

\[ \mathrm{Fix}(A) := \{ x \in H \mid x \in A(x) \} \quad \text{and} \quad \mathrm{Zer}(A) := A^{-1}(0_H). \]

Finally, if $A(x)$ is a singleton for all $x \in D(A)$, we say that $A$ is single-valued over $D(A)$ and we simply write $A : D(A) \to H$, identifying the restriction of $A$ on $D(A)$ to the corresponding standard map.

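Let $A, B : H \rightrightarrows H$ be two set-valued maps. In this chapter we write that $A \subset B$ when $A(x) \subset B(x)$ for all $x \in H$. Note that, if $A \subset B$, then $A^{-1} \subset B^{-1}$. Finally, the sum $A + B : H \rightrightarrows H$ and the composition $B \circ A : H \rightrightarrows H$ are respectively defined by

\[ (A+B)(x) := A(x) + B(x) \quad \text{and} \quad (B \circ A)(x) := \bigcup_{y \in A(x)} B(y), \]

for all $x \in H$.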



Monotone operators and resolvent operators. Let $A : H \rightrightarrows H$ be a set-valued operator. We say that $A$ is monotone if $\langle y_2 - y_1, x_2 - x_1 \rangle \geq 0$ for all $(x_1, y_1), (x_2, y_2) \in \mathrm{Gr}(A)$. Note that $A$ is monotone if and only if $A^{-1}$ is monotone. Moreover $A$ is said to be maximal monotone if it is monotone and the inclusion $A \subset B$, with $B$ monotone, implies that $A = B$. In this chapter we denote by $\mathcal{M}(H)$ the set of all maximal monotone operators on $H$. Note that $A \in \mathcal{M}(H)$ if and only if $A^{-1} \in \mathcal{M}(H)$. Let us recall the following characterization due to Minty [256] (see also [288, Theorem 1.2]): a monotone set-valued operator $A : H \rightrightarrows H$ belongs to $\mathcal{M}(H)$ if and only if $R(I + A) = H$.

The resolvent operator of a general set-valued operator $A : H \rightrightarrows H$ is the set-valued map $J_A : H \rightrightarrows H$ defined by

\[ J_A := (I + A)^{-1}. \]

If $A$ is monotone, one can easily see that $J_A : D(J_A) \to H$ is single-valued over $D(J_A)$, and that $A \in \mathcal{M}(H)$ if and only if $D(J_A) = H$.

Basics of convex analysis. As usual in the literature, we denote by $\Gamma_0(H)$ the set of all proper closed convex functions $f : H \to \mathbb{R} \cup \{+\infty\}$. For all $f \in \Gamma_0(H)$, we denote by $\mathrm{dom}(f)$ the domain of $f$ and by $\partial f : H \rightrightarrows H$ its subdifferential operator defined by

\[ \partial f(x) := \{ y \in H \mid \forall z \in H, \ \langle y, z - x \rangle \leq f(z) - f(x) \}, \]

for all $x \in H$. In that context it is well known that $\partial f \in \mathcal{M}(H)$ (see [272]) and that the resolvent operator associated to $\partial f$ exactly coincides with the proximal operator of $f$ introduced in 1965 by Moreau [258], that is, $J_{\partial f} = \mathrm{prox}_f$.

If $K \subset H$ is a nonempty closed convex subset of $H$, we denote by $\iota_K : H \to \mathbb{R} \cup \{+\infty\}$ the indicator function of $K$, defined by $\iota_K(x) := 0$ if $x \in K$ and $\iota_K(x) := +\infty$ otherwise. Recall that $\iota_K \in \Gamma_0(H)$ and its subdifferential exactly coincides with the normal cone to $K$, that is, $\partial \iota_K = N_K$. Moreover the proximal operator of $\iota_K$ exactly coincides with the projection operator onto $K$, that is, $\mathrm{prox}_{\iota_K} = \mathrm{proj}_K$.

9.2.2 The A-resolvent operator of B and a decomposition formula

In this section I first recall in Definition 9.1 the notion of $A$-resolvent operator of $B$ originally introduced in the paper [B28]. Several examples have been provided in [B28, Examples 1 to 4] but they will not be recalled in the present chapter. The decomposition formula, which constitutes the major contribution of the paper [B28], is recalled in Theorem 9.2. It is followed by a list of general comments.

Definition 9.1 ($A$-resolvent operator of $B$). Let $A, B : H \rightrightarrows H$ be two set-valued operators. The set-valued map $J^A_B : H \rightrightarrows H$, defined by

\[ J^A_B := (I + B \circ J_A)^{-1} = J_{B \circ J_A}, \]

is called the $A$-resolvent operator of $B$.

Theorem 9.2 (Decomposition formula). Let $A, B : H \rightrightarrows H$ be two set-valued operators. It holds that $D(J^A_B) = D(J_{A+B})$ and $J_{A+B} \subset J_A \circ J^A_B$. Moreover, if $A$ is monotone, then the decomposition formula

\[ J_{A+B} = J_A \circ J^A_B, \]

holds true.

Remark 9.3. Theorem 9.2 can be found in [B28, Theorem 2]. Its proof is simple, being essentially based on basic computations.
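One possible way of organizing these basic computations is sketched below. It is only an illustrative reconstruction under the notations of Section 9.2.1, not a reproduction of the proof given in [B28]; the auxiliary symbols $a$, $b$ and $z$ are introduced for this sketch only, and the equality of the domains is not addressed.

```latex
% Inclusion J_{A+B} \subset J_A \circ J^A_B (no monotonicity is needed).
Let $y \in J_{A+B}(x)$, that is, $x = y + a + b$ with $a \in A(y)$ and $b \in B(y)$.
Set $z := y + a$. Then $z \in (I + A)(y)$, i.e. $y \in J_A(z)$, and
\[
  x = z + b \in z + B(y) \subset z + (B \circ J_A)(z),
\]
so that $z \in J^A_B(x)$ and $y \in J_A(z) \subset (J_A \circ J^A_B)(x)$.

% Reverse inclusion when A is monotone (J_A is then single-valued on D(J_A)).
Let $y \in (J_A \circ J^A_B)(x)$: there exists $z \in J^A_B(x)$ with $y = J_A(z)$. Then
\[
  x - z \in B(J_A(z)) = B(y) \quad \text{and} \quad z - y \in A(y),
\]
hence $x - y = (x - z) + (z - y) \in (A + B)(y)$, i.e. $y \in J_{A+B}(x)$.
```

The single-valuedness of $J_A$ is used in the second step only: without it, $x - z$ would belong to $B(p)$ for some $p \in J_A(z)$ that need not coincide with $y$.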



Remark 9.4. Let $A, B : H \rightrightarrows H$ be two set-valued operators. If $A$ is the null operator (for instance), it holds that $J^A_B = J_B$. As a consequence $J^A_B$ can be seen as a generalization of $J_B$, or as a perturbation of $J_B$ by the operator $A$. Note that [B28, Example 1] provides a simple situation where $J^A_B = J_B$ while $A$ is not the null operator. Actually, one can easily see that if $B \circ J_A \subset B$ (resp. $B \subset B \circ J_A$), then $J^A_B \subset J_B$ (resp. $J_B \subset J^A_B$). In particular, if $B \circ J_A = B$, then $J^A_B = J_B$. In that context, if moreover $A$ is monotone, note that the decomposition $J_{A+B} = J_A \circ J_B$ holds true from Theorem 9.2. However let me recall that this suitable formula $J_{A+B} = J_A \circ J_B$ does not hold true in general (see, e.g., [294, Example 2] for a counterexample).

Remark 9.5. The inclusion $J_{A+B} \subset J_A \circ J^A_B$ in Theorem 9.2 might be strict if $A$ is not monotone. I refer to [B28, Example 3] for an example.

Remark 9.6. Let $A, B : H \rightrightarrows H$ be two set-valued operators with $A$ and $A + B$ monotone. In particular $J_{A+B} : D(J_{A+B}) \to H$ is single-valued. Let $x \in D(J^A_B) = D(J_{A+B})$. Theorem 9.2 states that, even if $J^A_B(x)$ is not a singleton (see Example 9.11 below for an example), then all elements in $J^A_B(x)$ have the same value through the resolvent operator $J_A$, and this value is equal to $J_{A+B}(x)$.

Remark 9.7. It is possible that $J^A_B$ is not single-valued over $D(J^A_B)$, even if $A$, $B$ and $A + B$ belong to $\mathcal{M}(H)$ (see Example 9.11 below for an example). In such a context, it follows that $B \circ J_A$ is not monotone and more generally that $J^A_B = J_{B \circ J_A}$ cannot be written as the resolvent operator $J_C$ of some monotone operator $C$. In [B28, Proposition 2], we were able from Theorem 9.2 to prove that, if $A$ and $A + B$ are monotone, then the single-valuedness of $A$ over $D(A)$ (or of $B$ over $D(B)$) is a sufficient condition which guarantees that $J^A_B$ is single-valued over $D(J^A_B)$.

Remark 9.8. Let $A, B : H \rightrightarrows H$ be two set-valued operators such that $A + B$ is monotone. From the Minty theorem and Theorem 9.2, note that $A + B \in \mathcal{M}(H)$ if and only if $D(J^A_B) = H$.

Remark 9.9. Let $A, B : H \rightrightarrows H$ be two set-valued operators. From Theorem 9.2 and since $A$ and $B$ play symmetric roles, note that $R(J_{A+B}) \subset R(J_A) \cap R(J_B)$.

Remark 9.10. Let $A, B : H \rightrightarrows H$ be two set-valued operators. If $J_B \subset J^A_B$ (see Example 9.11 below for an example), we are in the situation where $J_B$ is a selection of $J^A_B$. Using Theorem 9.2, we were able in [B28, Proposition 3] to specify this selection in the case where $A, B \in \mathcal{M}(H)$. Precisely, in that situation, we first proved that $J^A_B(x)$ is a nonempty closed convex subset of $H$ for all $x \in D(J^A_B)$ and secondly, we proved that, if $J_B(x) \in J^A_B(x)$ for some $x \in H$, then

\[ J_B(x) = \mathrm{proj}_{J^A_B(x)}\big( J_{A+B}(x) \big). \]

Example 9.11. Let $H = \mathbb{R}$ and let $A = N_{\mathbb{R}_+}$ and $B = N_{\mathbb{R}_-}$. It holds that $A + B = N_{\{0\}}$. In particular we have $A$, $B$ and $A + B \in \mathcal{M}(\mathbb{R})$. One can easily compute that $J^A_B(x) = (-\infty, \min(x,0)]$ for all $x \in \mathbb{R}$. Moreover we have $B \subset B \circ J_A$ and thus $J_B \subset J^A_B$. Since $J_{A+B}(x) = 0$ for all $x \in \mathbb{R}$, we get from the previous remark that

\[ J_B(x) = \mathrm{proj}_{J^A_B(x)}(0), \]

for all $x \in \mathbb{R}$. We deduce that $J_B(x)$ is the particular selection that corresponds to the element of minimal norm in $J^A_B(x)$ (also known as the lazy selection). This lazy selection is illustrated by the graphs of $J_B = \mathrm{proj}_{\mathbb{R}_-}$ and $J^A_B$ provided in Figure 9.1. This example is extracted from [B28, Example 4].
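The computation of Example 9.11 can be checked numerically on a grid. The Python sketch below is only an illustration and is not taken from [B28]: all function names are hypothetical, and the membership $y \in J^A_B(x)$ is tested through the condition $x - y \in B(J_A(y))$, which follows from Definition 9.1 when $J_A$ is single-valued.

```python
def proj_pos(y):
    """J_A: projection onto R_+ (resolvent of the normal cone A = N_{R_+})."""
    return max(y, 0.0)


def proj_neg(x):
    """J_B: projection onto R_- (resolvent of the normal cone B = N_{R_-})."""
    return min(x, 0.0)


def in_normal_cone_neg(s, p):
    """True if s lies in N_{R_-}(p): empty if p > 0, {0} if p < 0, R_+ if p = 0."""
    if p > 0.0:
        return False
    if p < 0.0:
        return s == 0.0
    return s >= 0.0


def in_JAB(y, x):
    """y in J^A_B(x)  <=>  x - y in B(J_A(y))."""
    return in_normal_cone_neg(x - y, proj_pos(y))


def check_example(grid):
    """Check the claims of Example 9.11 on a finite grid of sample points."""
    for x in grid:
        members = [y for y in grid if in_JAB(y, x)]
        # J^A_B(x) = (-inf, min(x, 0)] restricted to the grid
        if members != [y for y in grid if y <= min(x, 0.0)]:
            return False
        # J_A sends every element of J^A_B(x) to J_{A+B}(x) = 0
        if any(proj_pos(y) != 0.0 for y in members):
            return False
        # lazy selection: J_B(x) is the element of J^A_B(x) of minimal norm
        if min(members, key=abs) != proj_neg(x):
            return False
    return True


grid = [k / 4.0 for k in range(-12, 13)]  # sample points in [-3, 3]
example_holds = check_example(grid)
```

Such a grid test obviously proves nothing; it is only a quick consistency check of the closed-form expressions stated in the example.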

9.2.3 A weakly convergent algorithm that computes $J^A_B$

Let $A, B \in \mathcal{M}(H)$. The first objective of this section is to expose the existing relationship between the $A$-resolvent operator $J^A_B$ of $B$ and the classical Douglas–Rachford operator $T_{A,B} : H \to H$ introduced by Lions and Mercier [251] and defined by

\[ T_{A,B}(x) := x - J_A(x) + J_B\big( 2 J_A(x) - x \big), \]



Figure 9.1 – Example 9.11, graph of $J_B = \mathrm{proj}_{\mathbb{R}_-}$ in bold line, and graph of $J^A_B$ in gray.

for all $x \in H$. For this purpose we introduce the extended version $\overline{T}_{A,B} : H \times H \to H$ of $T_{A,B}$ defined by

\[ \overline{T}_{A,B}(x, y) := y - J_A(y) + J_B\big( x + J_A(y) - y \big), \]

for all $x, y \in H$. Note that $T_{A,B}(x) = \overline{T}_{A,B}(J_A(x), x)$ for all $x \in H$, and that the definition of $\overline{T}_{A,B}$ only depends on the knowledge of $J_A$ and $J_B$. The relationship between $J^A_B$ and $\overline{T}_{A,B}$ is exposed in the next proposition.

Proposition 9.12. Let $A, B \in \mathcal{M}(H)$. It holds that

\[ J^A_B(x) = \mathrm{Fix}\big( \overline{T}_{A,B}(x, \cdot) \big), \]

for all $x \in H$.
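The identity of Proposition 9.12 can be unfolded as the following chain of equivalences, where $\overline{T}_{A,B}$ denotes the extended Douglas–Rachford operator defined above. This is a sketch written for the present summary (the proof itself is in [B28]); it uses only that $J_A$ and $J_B$ are single-valued and defined on the whole space $H$, which holds since $A, B \in \mathcal{M}(H)$.

```latex
\begin{align*}
y \in \mathrm{Fix}\big( \overline{T}_{A,B}(x,\cdot) \big)
  &\iff y = y - J_A(y) + J_B\big( x + J_A(y) - y \big) \\
  &\iff J_A(y) = J_B\big( x + J_A(y) - y \big) \\
  &\iff x + J_A(y) - y \in J_A(y) + B\big( J_A(y) \big) \\
  &\iff x - y \in B\big( J_A(y) \big) \\
  &\iff x \in (I + B \circ J_A)(y) \\
  &\iff y \in J^A_B(x).
\end{align*}
```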

Proposition 9.12 and its proof can be found in [B28, Lemma 4]. From this result we were able in [B28, Theorem 3] to propose an algorithm, depending only on the knowledge of $J_A$ and $J_B$, which computes numerically an element of $J^A_B(x)$ for all $x \in D(J^A_B)$. This is the content of the next theorem.

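Theorem 9.13. Let $A, B \in \mathcal{M}(H)$ and let $x \in D(J^A_B)$ be fixed. Then Algorithm $(\mathcal{A})$ given by

\[
\begin{cases}
y_0 \in H, \\
\forall k \in \mathbb{N}, \quad y_{k+1} = \overline{T}_{A,B}(x, y_k),
\end{cases}
\tag{$\mathcal{A}$}
\]

weakly converges to an element $y^* \in J^A_B(x)$. Moreover it holds that $J_A(y^*) \in J_{A+B}(x)$.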

Remark 9.14. Theorem 9.13 can be found in [B28, Theorem 3]. Its proof is based on Proposition 9.12 and Theorem 9.2, but also on a technical rewriting of $\overline{T}_{A,B}(x, \cdot)$ as the composition of two firmly nonexpansive maps and then on the application of [213, Theorem 5.23]. Actually note that [213, Theorem 5.23] even makes it possible to conclude in Theorem 9.13 that the sequence $(J_A(y_k))_{k \in \mathbb{N}}$ weakly converges to $J_A(y^*) \in J_{A+B}(x)$.
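To make the iteration of Algorithm $(\mathcal{A})$ concrete, here is a minimal Python sketch on a toy case in $H = \mathbb{R}$. The choice of operators is an assumption made for illustration only (it does not come from [B28]): $A = \lambda \, \partial|\cdot|$, whose resolvent is the soft-thresholding map, and $B = c \, I$ with $c \geq 0$. In this toy case $J_{A+B}(x)$ has the closed form $J_A(x)/(1+c)$, which is used to check the output. All function names are hypothetical.

```python
def soft_threshold(y, lam):
    """J_A for A = lam * (subdifferential of |.|) on R."""
    if y > lam:
        return y - lam
    if y < -lam:
        return y + lam
    return 0.0


def resolvent_linear(s, c):
    """J_B for the linear monotone operator B = c * I on R (c >= 0)."""
    return s / (1.0 + c)


def algorithm_A(x, J_A, J_B, y0=0.0, tol=1e-12, max_iter=10_000):
    """Iterate y_{k+1} = y_k - J_A(y_k) + J_B(x + J_A(y_k) - y_k)."""
    y = y0
    for _ in range(max_iter):
        p = J_A(y)
        y_next = y - p + J_B(x + p - y)
        if abs(y_next - y) <= tol:  # stop once the iterates stagnate
            return y_next
        y = y_next
    return y


def resolvent_of_sum(x, lam, c):
    """Approximate J_{A+B}(x) as J_A(y*), where y* is the limit of the iteration."""
    y_star = algorithm_A(
        x,
        lambda y: soft_threshold(y, lam),
        lambda s: resolvent_linear(s, c),
    )
    return soft_threshold(y_star, lam)


lam, c, x = 1.0, 1.0, 3.0
approx = resolvent_of_sum(x, lam, c)
exact = soft_threshold(x, lam) / (1.0 + c)  # closed form valid for this toy case only
```

The stopping test on successive iterates is a pragmatic choice for the sketch; Theorem 9.13 itself only asserts weak convergence and provides no rate.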

Remark 9.15. For at least the particular case where $A$ and $B$ are subdifferential operators of proper closed convex functions, Algorithm $(\mathcal{A})$ was already considered, up to some translations, and implemented in previous works (see, e.g., the so-called dual forward-backward splitting in [226, Algorithm 3.5]). This fact points out that the operator $J^A_B$ is already present (in a hidden form) and useful for numerical purposes in the existing literature. However, to the best of my knowledge, it has never been explicitly expressed in a closed formula (such as in Definition 9.1), nor studied from a theoretical point of view.



Remark 9.16. In the present chapter our point of view is based on the decomposition formula provided in Theorem 9.2. Note that different approaches that compute numerically the resolvent operator of the sum of two (or more) maximal monotone operators can be found in the literature. Let us mention for example the averaged alternating modified reflections algorithm (AAMR) splitting method which was used in 2019 in the paper [210]. Note that the authors proved that the algorithm is moreover strongly convergent.

9.3 Basic application in elliptic PDEs

Following the spirit of the founding paper [251] by Lions and Mercier, this section is devoted to an application of our theoretical results in elliptic PDEs. Let $\Omega \subset \mathbb{R}^d$ be a nonempty open subset of $\mathbb{R}^d$, where $d \in \mathbb{N}^*$ is a fixed positive integer. Let $H := L^2(\Omega)$ stand for the usual Lebesgue space of real square-integrable functions defined on $\Omega$ and fix some $f \in L^2(\Omega)$. In what follows $\Delta$ denotes the standard Laplace diffusion operator (understood in the usual distributional sense) and we introduce the two sets

\[ L^2_+(\Omega) := \{ \varphi \in L^2(\Omega) \mid \varphi \geq 0 \text{ a.e. on } \Omega \} \quad \text{and} \quad L^2_\Delta(\Omega) := \{ \varphi \in L^2(\Omega) \mid \Delta \varphi \in L^2(\Omega) \}. \]

Finally, for all $\varphi \in L^2(\Omega)$, we denote by $\varphi^+ := \max(\varphi, 0)$ (resp. $\varphi^- := \min(\varphi, 0)$) the standard nonnegative (resp. nonpositive) part of $\varphi$.

With no boundary condition. In this paragraph we focus on the nonlinear elliptic partial differential equation problem given by

\[ \text{find } u \in L^2(\Omega) \text{ such that } u^+ \in L^2_\Delta(\Omega) \text{ and } -\Delta(u^+) + u = f, \tag{ELL} \]

and on the classical obstacle problem given by

\[ \text{find } v \in L^2(\Omega) \text{ such that } v \in L^2_\Delta(\Omega) \text{ and } 0 \leq (-\Delta v + v - f) \perp v \geq 0. \tag{OP} \]

In Problem (ELL) we say that the Laplace operator is partially blinded in the sense that the diffusion only occurs on the nonnegative part of the unknown $u$. In Problem (OP) the complementarity condition $0 \leq (-\Delta v + v - f) \perp v \geq 0$ has to be understood as follows:

\[ v \geq 0, \quad -\Delta v + v - f \geq 0 \quad \text{and} \quad (-\Delta v + v - f) \, v = 0, \quad \text{a.e. on } \Omega. \]

We say that $f$ is admissible for Problem (ELL) (resp. Problem (OP)) if Problem (ELL) (resp. Problem (OP)) admits at least one solution. From Theorem 9.2, we were able in [B28, Proposition 4] to prove the next proposition.

Proposition 9.17. It holds that $f \in L^2(\Omega)$ is admissible for Problem (ELL) if and only if it is admissible for Problem (OP). In that case, the set of solutions to Problem (OP) is given by

\[ \{ u^+ \mid u \text{ is a solution to Problem (ELL)} \}. \]

With a homogeneous Dirichlet boundary condition. In this paragraph we assume moreover that $\Omega$ is a nonempty bounded Lipschitz domain of $\mathbb{R}^d$ and we use the usual Sobolev space $H^1_0(\Omega)$ (see classical textbooks such as [220, 233] for details). In this paragraph we consider the nonlinear boundary value problem given by

\[ \text{find } u \in L^2(\Omega) \text{ such that } u^+ \in H^1_0(\Omega) \cap L^2_\Delta(\Omega) \text{ and } -\Delta(u^+) + u = f, \tag{ELL$_0$} \]

and the obstacle problem with boundary value given by

\[ \text{find } v \in L^2(\Omega) \text{ such that } v \in H^1_0(\Omega) \cap L^2_\Delta(\Omega) \text{ and } 0 \leq (-\Delta v + v - f) \perp v \geq 0. \tag{OP$_0$} \]



In this paragraph we will also consider the very well known basic linear boundary value problem given by

\[ \text{find } w \in L^2(\Omega) \text{ such that } w \in H^1_0(\Omega) \cap L^2_\Delta(\Omega) \text{ and } -\Delta w + w = f, \tag{LP$_0$} \]

which admits a unique solution (see, e.g., [220, Theorem 9.21]). Clearly Problem (LP$_0$) depends on $f$. In the next result, this dependence will be denoted by (LP$_0(f)$).

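Proposition 9.18. Problems (ELL$_0$) and (OP$_0$) each admit a unique solution, denoted respectively by $u$ and $v$. Moreover it holds that $v = u^+$. Finally the algorithm given by

\[
\begin{cases}
z_0 \in L^2(\Omega), \\
\forall k \in \mathbb{N}, \quad z_{k+1} = z_k^- + \xi_k, \ \text{where } \xi_k \text{ is the solution to Problem (LP}_0(f - z_k^-)\text{)},
\end{cases}
\tag{$\mathcal{A}_0$}
\]

weakly converges in $L^2(\Omega)$ to $u$.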

Remark 9.19. Proposition 9.18 can be found in [B28, Proposition 5]. Its proof is based on Theorems 9.2 and 9.13, but also on the application of the Riesz–Fréchet representation theorem, the Stampacchia theorem and some regularity results.
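The structure of Algorithm $(\mathcal{A}_0)$ can be illustrated on a one-dimensional finite-difference analogue. The Python sketch below is an assumption-laden toy version and not the FREEFEM++ implementation of [B28]: it takes $\Omega = (0,1)$, a uniform grid with 9 interior nodes, the source term $f(x) = \sin(2\pi x)$, and solves the discrete counterpart of Problem (LP$_0$) with the Thomas algorithm. All function names are hypothetical.

```python
import math


def solve_lp0(rhs, h):
    """Finite-difference solve of -w'' + w = rhs on (0, 1), w(0) = w(1) = 0.

    The tridiagonal system is solved with the Thomas algorithm.
    """
    n = len(rhs)
    diag = 1.0 + 2.0 / h**2
    off = -1.0 / h**2
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):  # forward elimination
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    w = [0.0] * n
    w[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        w[i] = d[i] - c[i] * w[i + 1]
    return w


def algorithm_a0(f, h, iterations=20000):
    """Iterate z_{k+1} = z_k^- + xi_k, where xi_k solves the discrete (LP0(f - z_k^-))."""
    z = [0.0] * len(f)
    for _ in range(iterations):
        z_neg = [min(zi, 0.0) for zi in z]
        xi = solve_lp0([fi - zn for fi, zn in zip(f, z_neg)], h)
        z = [zn + x for zn, x in zip(z_neg, xi)]
    return z


def obstacle_residual(v, f, h):
    """Discrete value of -v'' + v - f at the interior nodes (zero boundary values)."""
    padded = [0.0] + list(v) + [0.0]
    return [
        (2.0 * padded[i] - padded[i - 1] - padded[i + 1]) / h**2 + padded[i] - f[i - 1]
        for i in range(1, len(v) + 1)
    ]


n = 9
h = 1.0 / (n + 1)
f = [math.sin(2.0 * math.pi * (i + 1) * h) for i in range(n)]
u = algorithm_a0(f, h)           # discrete analogue of the solution to (ELL0)
v = [max(ui, 0.0) for ui in u]   # v = u^+, candidate solution to the discrete (OP0)
r = obstacle_residual(v, f, h)   # should be >= 0 and complementary to v
```

In this finite-dimensional setting the iteration map is a strict contraction, so a fixed (large) number of iterations is enough for the sketch; on fine meshes the contraction factor approaches 1 and a proper stopping criterion would be needed.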

We conclude this section by a numerical illustration of Algorithm $(\mathcal{A}_0)$ in the two-dimensional case $d = 2$ with $\Omega$ the open disk of center $0_{\mathbb{R}^2}$ and of radius $\frac{3\pi}{2}$ and with the source term $f(x,y) := x e^{-x^2 - y^2}$ for all $(x,y) \in \Omega$. The numerical simulations presented hereafter in Figure 9.2 are extracted from [B28, Section 5] and performed using the finite element library FREEFEM++ [239] (with P1 finite element discretization) and MATLAB software for the 3D graphical representations.

(a) A plot of the data $f(x,y) := x e^{-x^2 - y^2}$.

(b) A plot of the solution $u$ to Problem (ELL$_0$) computed from Algorithm $(\mathcal{A}_0)$.

(c) A plot of the solution $v$ to Problem (OP$_0$) by applying $v = u^+$.

Figure 9.2 – Numerical simulations of Section 9.3



9.4 Some comments on the earlier work [B27]

This last section is dedicated to comments on the earlier publication [B27] written in collaboration with Adly and Caubet. In what follows consider $f, g \in \Gamma_0(H)$.

First comment. Our main objective in [B27] was to provide a decomposition formula for $\mathrm{prox}_{f+g}$. To this aim we introduced the notion of $f$-proximal operator of $g$, denoted by $\mathrm{prox}^f_g$, which exactly corresponds to the $\partial f$-resolvent operator of $\partial g$, that is, $\mathrm{prox}^f_g = J^{\partial f}_{\partial g}$. Under the two conditions $\mathrm{dom}(f) \cap \mathrm{dom}(g) \neq \emptyset$ and $\partial(f+g) = \partial f + \partial g$ (this last condition is well studied in the literature, see [213, Corollary 16.48] for the Moreau–Rockafellar theorem), we proved in [B27, Theorem 2.8] that

\[ \mathrm{prox}_{f+g} = \mathrm{prox}_f \circ \mathrm{prox}^f_g. \]

From the more recent work [B28] summarized in the present chapter, we are actually able to assert that

\[ J_{\partial f + \partial g} = \mathrm{prox}_f \circ \mathrm{prox}^f_g, \]

without any condition. In the framework of [B27, Theorem 2.8], note that the condition $\mathrm{dom}(f) \cap \mathrm{dom}(g) \neq \emptyset$ ensures that $f + g \in \Gamma_0(H)$ and that the condition $\partial(f+g) = \partial f + \partial g$ ensures that $J_{\partial f + \partial g} = J_{\partial(f+g)} = \mathrm{prox}_{f+g}$.

Second comment. It is well known that obtaining a theoretical formula for $\mathrm{prox}_{f+g}$ which depends on $\mathrm{prox}_f$ and $\mathrm{prox}_g$ is an open mathematical challenge. We gave in [B27, Appendix A] a more precise description of this difficulty, claiming that there is no closed formula, independent of $f$ and $g$, allowing to write $\mathrm{prox}_{f+g}$ as a linear combination of compositions of linear combinations of $I$, $\mathrm{prox}_f$, $\mathrm{prox}_g$, $\mathrm{prox}_f^{-1}$ and $\mathrm{prox}_g^{-1}$.

Third comment. As for the standard proximal operator, we proved in [B27, Proposition 3.2] that the $f$-proximal operator of $g$ can be related to a variational inequality on the one hand, and to a convex minimization problem on the other hand. Moreover, taking $A = \partial f$ and $B = \partial g$, we emphasized in [B27, Remark 3.6] that Algorithm $(\mathcal{A})$ corresponds to the well known forward-backward algorithm (see, e.g., [227, Section 10.3]) applied to this convex minimization problem.

Fourth comment. In [B27, Section 4.2] we returned to our initial motivation presented in the prelude Chapter 8. Precisely, under some assumptions (see [B27, Proposition 4.3] for details), we derived from the decomposition formula that if

\[ u(t) := \mathrm{prox}_{\iota_K + g}(r(t)), \]

for all $t \geq 0$, where $K \subset H$ is a nonempty closed convex subset and where $g \in \Gamma_0(H)$ and $r : \mathbb{R}_+ \to H$ are smooth enough, then

\[ u'(0) = \mathrm{prox}_{\iota_C + \psi_g}(r'(0)), \]

where $C$ is a nonempty closed convex subset of $H$ (related to $K$, via the works of Haraux [237] and Mignot [255]) and where $\psi_g(x) := \frac{1}{2} \langle D^2 g(u(0))(x), x \rangle$ for all $x \in H$. However it should be mentioned that the assumptions of [B27, Proposition 4.3] are restrictive, raising open questions about their relaxation. In particular it is mentioned in [B27, Remark 4.5] that twice epi-differentiability, which is a notion introduced and thoroughly studied by Rockafellar [274, 276, 278], is a promising idea in order to obtain more general and deeper results in that direction. This remark led us to investigate this notion, which resulted in the two works [B29, B26] presented respectively in the next Chapters 10 and 11.


Chapter 10

The derivative of a parameterized mechanical contact problem with a Tresca friction law involves Signorini unilateral conditions


10.2 Basics on Mosco epi-convergence and twice epi-differentiability

10.2.1 Mosco epi-convergence

10.2.2 Twice epi-differentiability

10.3 Main result

10.3.1 Functional setting and basic results

10.3.2 A general Signorini problem

10.3.3 A general Tresca friction problem and the Tresca friction functional

10.3.4 The derivative of a parameterized Tresca friction problem

10.4 Illustration with some numerical simulations

10.5 Concluding remarks and perspectives


• [B29] S. Adly, L. Bourdin, and F. Caubet. The derivative of a parameterized mechanical contact problem with a Tresca’s friction law involves Signorini unilateral conditions. Submitted, 2020.


10.1 Introduction

Context in contact mechanics. Contact and friction phenomena with deformable bodies are increasingly taken into account in industrial models and engineering applications. We can cite for example: wheel-ground contact analysis in aeronautics, assemblies of mechanical processes, etc. In general the mechanical setting consists in a deformable body which is in contact with a rigid foundation. The elastic body is deformed under some volumic forces and surface tractions without penetrating the rigid foundation. Usually the mathematical models of these mechanical contact problems lead to nonlinear boundary value problems, including unilateral (possibly nonsmooth) constraints, where the unknowns are the displacement and the stress field. When possible, the corresponding weak mathematical formulations are expressed as variational inequalities of the first or second kind. These variational formulations are generally used in order to prove existence, uniqueness, regularity of solutions as well as for numerical purposes.

The so-called Signorini problem is a mechanical contact problem without friction. It consists in finding the equilibrium configuration of an elastic body in a frictionless contact with a rigid surface. This problem was first formulated by Signorini [286] in 1933, and later in 1959 in the paper [287]. In 1963, Fichera proved in [235] the existence and uniqueness of the solution to the Signorini problem by minimizing the corresponding quadratic potential energy functional. Signorini laws are expressed as complementarity relations and translate the non-penetrability of the contact zone on the obstacle, the non-appearance of traction forces on the contact zone and the complementarity of normal forces and displacements. The weak formulation of the Signorini problem can be recast into a variational inequality of the first kind and the literature is abundant on both theoretical and numerical aspects on this subject. Comprehensive references in this field include [208, 209, 248, 280]. When dealing with frictional contact problems of deformable bodies, a Coulomb friction model was studied by Duvaut and Lions [231]. The main difficulty of this model comes from the fact that the friction functional depends on the normal stress of the unknown displacement, which leads to a nonvariational problem. The Tresca model can be seen as a simplified Coulomb friction law with a given friction threshold (see, e.g., [231]). It can be considered as a first step towards the treatment of the more complicated mathematical formulation of the Coulomb friction law. The weak formulation of the Tresca friction problem is a variational inequality of the second kind involving a nondifferentiable convex integral friction functional. For more details about the formulations of the Tresca and Coulomb models, I refer the reader to [231, 246, 285].

Motivations. In general optimization theory, the sensitivity analysis of the state with respect to given parameters plays a fundamental role in order to formulate necessary optimality conditions or for numerical purposes (for the implementation of gradient descent methods for example). As explained in the prelude Chapter 8, when I started my collaboration with Adly and Caubet, our primary motivation was shape optimization problems, that is, determining the optimal design of a given object for industrial or engineering applications, involving mechanical contact and friction phenomena. To this aim we gradually worked through all the underlying issues in order to prepare the ground for the treatment of such problems. In particular, dealing with the sensitivity analysis of the state of such problems is a difficult task due to the unilateral (possibly nonsmooth) character of the models.

The objective of the paper [B29], written with Adly and Caubet, was to provide an original methodology based on the mathematical tools of convex analysis in order to deal with the sensitivity analysis of the Tresca friction problem with respect to right-hand source term perturbations. Precisely we focused in this paper on the scalar version of the Tresca friction problem which consists in the boundary value problem given by

\[
\begin{cases}
-\Delta u + u = f & \text{in } \Omega, \\
|\partial_n u| \leq 1 \ \text{and} \ |u| \, \partial_n u + u = 0 & \text{on } \Gamma,
\end{cases}
\tag{TP}
\]



where $\Omega \subset \mathbb{R}^d$ is a nonempty bounded connected open subset of $\mathbb{R}^d$, $d \in \mathbb{N}^*$, with a Lipschitz continuous boundary $\Gamma := \partial\Omega$ and $f \in L^2(\Omega)$. Considering that the right-hand source term $f$ is perturbed and replaced by $f_t \in L^2(\Omega)$, where $t \geq 0$ is a parameter, our aim was to study the differentiability at $t = 0$ of the unique solution $u_t$ to the parameterized problem (obtained by replacing $f$ by $f_t$, see Problem (TP$_t$) below) and to express its derivative, denoted by $u'_0$, as the unique solution to a new boundary value problem.

Main result. The main result of the collaboration [B29] claims that, for a given right-hand source term $f_t \in L^2(\Omega)$ depending on a parameter $t \geq 0$, the map $t \geq 0 \mapsto u_t \in H^1(\Omega)$, where $u_t$ stands for the unique solution to the parameterized Tresca friction problem

\[
\begin{cases}
-\Delta u_t + u_t = f_t & \text{in } \Omega, \\
|\partial_n u_t| \leq 1 \ \text{and} \ |u_t| \, \partial_n u_t + u_t = 0 & \text{on } \Gamma,
\end{cases}
\tag{TP$_t$}
\]

is differentiable at $t = 0$, and its derivative $u'_0 \in H^1(\Omega)$ is the unique (weak) solution to the Signorini-type problem

\[
\begin{cases}
-\Delta u'_0 + u'_0 = f'_0 & \text{in } \Omega, \\
\partial_n u'_0 = 0 & \text{on } \Gamma^{u_0}_N, \\
u'_0 = 0 & \text{on } \Gamma^{u_0}_D, \\
u'_0 \leq 0, \ \partial_n u'_0 \leq 0 \ \text{and} \ u'_0 \, \partial_n u'_0 = 0 & \text{on } \Gamma^{u_0}_{S-}, \\
u'_0 \geq 0, \ \partial_n u'_0 \geq 0 \ \text{and} \ u'_0 \, \partial_n u'_0 = 0 & \text{on } \Gamma^{u_0}_{S+},
\end{cases}
\tag{SP$'_0$}
\]

where the decomposition $\Gamma = \Gamma^{u_0}_N \cup \Gamma^{u_0}_D \cup \Gamma^{u_0}_{S-} \cup \Gamma^{u_0}_{S+}$ depends on $u_0$ (see Theorem 10.21 for details). This result, proved under some appropriate assumptions, establishes a direct link between Tresca and Signorini problems. Roughly speaking, in our context, it emphasizes the fact that Signorini solutions can be considered as first-order approximations of perturbed Tresca solutions in the following sense: for small values $t > 0$, the function $u_t$ can be approximated in $H^1$-norm by $u_0 + t u'_0$. Such approximations have been numerically computed on some explicit examples in [B29, Section 4] for an illustrative purpose. These numerical simulations are recalled in Section 10.4 of the present chapter.

Methodology. The methodology that we used in [B29] is based on mathematical tools of the convex analysis community. Firstly, the unique solution to Problem (TP$_t$) is expressed in terms of proximal operator as

\[ u_t = \mathrm{prox}_\Phi(F_t), \]

for all $t \geq 0$, where $F_t \in H^1(\Omega)$ stands for the unique solution to the classical Neumann problem associated to $f_t$ (see Problem (8.7) in Chapter 8) and where $\Phi : H^1(\Omega) \to \mathbb{R}$ stands for the Tresca friction functional given by

\[
\Phi : H^1(\Omega) \longrightarrow \mathbb{R}, \qquad \varphi \longmapsto \int_\Gamma |\varphi|,
\]

which is a proper closed convex function on the Hilbert space $H^1(\Omega)$. Hence the differentiability of the map $t \geq 0 \mapsto u_t \in H^1(\Omega)$ at $t = 0$ is related to the differentiability (in a generalized sense) of the proximal operator $\mathrm{prox}_\Phi$. To this aim, the approach proposed in [B29] is based on the notion of twice epi-differentiability introduced by Rockafellar in [274] and leads to the characterization of the derivative $u'_0 \in H^1(\Omega)$ in terms of the proximal operator of the second-order epi-derivative $d^2_e \Phi$ of $\Phi$, precisely as

\[ u'_0 = \mathrm{prox}_{d^2_e \Phi(u_0 | F_0 - u_0)}(F'_0). \]

In our paper [B29] we finally proved that the above equality also characterizes the unique (weak) solution to the Signorini problem (SP$'_0$).
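The mechanism behind the last formula can be observed on a scalar toy analogue, which is an assumption of this sketch and not the setting of [B29]: take $H = \mathbb{R}$ and replace the Tresca functional by $\Phi = |\cdot|$, so that $\mathrm{prox}_\Phi$ is the soft-thresholding map with threshold 1. Using the expression of the second-order epi-derivative of $|\cdot|$ recalled in Example 10.6, the proximal operator of $d^2_e\Phi(u_0|F_0 - u_0)$ reduces to a projection onto an interval, which the Python code below compares with a one-sided finite difference. All names are hypothetical.

```python
def prox_abs(x):
    """Proximal operator of |.| on R (soft-thresholding with threshold 1)."""
    if x > 1.0:
        return x - 1.0
    if x < -1.0:
        return x + 1.0
    return 0.0


def proj_K(x, y, d):
    """Projection of d onto the set K_{x,y} of Example 10.6."""
    if x != 0.0:
        return d            # K = R
    if y == 1.0:
        return max(d, 0.0)  # K = R_+
    if y == -1.0:
        return min(d, 0.0)  # K = R_-
    return 0.0              # K = {0}


def derivative_formula(F0, dF0):
    """u'_0 = prox of the second-order epi-derivative at (u_0 | F_0 - u_0), applied to F'_0."""
    u0 = prox_abs(F0)
    return proj_K(u0, F0 - u0, dF0)


def finite_difference(F0, dF0, t=1e-6):
    """One-sided difference quotient of t -> prox_abs(F0 + t * dF0) at t = 0."""
    return (prox_abs(F0 + t * dF0) - prox_abs(F0)) / t


cases = [(2.0, 0.7), (1.0, 2.0), (1.0, -2.0), (0.3, 5.0), (-1.0, 2.0), (-1.0, -2.0)]
gaps = [abs(derivative_formula(F0, dF0) - finite_difference(F0, dF0)) for F0, dF0 in cases]
```

The cases $F_0 = \pm 1$ are the scalar counterparts of the Signorini-type boundary parts: the derivative is then only one-sided and given by the projection onto a half-line.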



10.2 Basics on Mosco epi-convergence and twice epi-differentiability

This section is dedicated to notations and basics on Mosco epi-convergence and twice epi-differentiability which are needed in order to state our main result [B29, Theorem 3.16] in Section 10.3. In this section let $H$ be a given real Hilbert space and let us keep the notations from set-valued operators theory and convex analysis introduced in Section 9.2.1 of the previous chapter. Finally, in the whole section, all limits with respect to $\tau > 0$ will be considered for $\tau \to 0^+$. For ease of notation, since no confusion occurs, the notation $\tau \to 0^+$ will be omitted.

10.2.1 Mosco epi-convergence

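Let $(S_\tau)_{\tau > 0}$ be a parameterized family of subsets of $H$. The outer, weak-outer, inner and weak-inner limits of $(S_\tau)_{\tau > 0}$ when $\tau \to 0^+$ are respectively defined by

\begin{align*}
\limsup S_\tau &:= \{ x \in H \mid \exists (t_k)_k \to 0^+, \ \exists (x_k)_k \to x, \ \forall k \in \mathbb{N}, \ x_k \in S_{t_k} \}, \\
\text{w-}\limsup S_\tau &:= \{ x \in H \mid \exists (t_k)_k \to 0^+, \ \exists (x_k)_k \rightharpoonup x, \ \forall k \in \mathbb{N}, \ x_k \in S_{t_k} \}, \\
\liminf S_\tau &:= \{ x \in H \mid \forall (t_k)_k \to 0^+, \ \exists (x_k)_k \to x, \ \exists K \in \mathbb{N}, \ \forall k \geq K, \ x_k \in S_{t_k} \}, \\
\text{w-}\liminf S_\tau &:= \{ x \in H \mid \forall (t_k)_k \to 0^+, \ \exists (x_k)_k \rightharpoonup x, \ \exists K \in \mathbb{N}, \ \forall k \geq K, \ x_k \in S_{t_k} \},
\end{align*}

where $\to$ (respectively $\rightharpoonup$) denotes the strong (respectively weak) convergence in $H$. Note that the four inclusions

\[ \liminf S_\tau \subset \limsup S_\tau \subset \text{w-}\limsup S_\tau \quad \text{and} \quad \liminf S_\tau \subset \text{w-}\liminf S_\tau \subset \text{w-}\limsup S_\tau, \]

always hold true.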

Definition 10.1 (Mosco convergence). A parameterized family $(S_\tau)_{\tau > 0}$ of subsets of $H$ is said to be Mosco convergent if

\[ \text{w-}\limsup S_\tau \subset \liminf S_\tau. \]

In that case we write $\text{M-}\lim S_\tau := \liminf S_\tau = \limsup S_\tau = \text{w-}\liminf S_\tau = \text{w-}\limsup S_\tau$.

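In the sequel let us denote by $\overline{\mathbb{R}} := \mathbb{R} \cup \{\pm\infty\}$. The domain and the epigraph of an extended real-valued function $\Phi : H \to \overline{\mathbb{R}}$ defined on $H$ are respectively given by

\[ \mathrm{dom}(\Phi) := \{ x \in H \mid \Phi(x) < +\infty \} \quad \text{and} \quad \mathrm{Epi}(\Phi) := \{ (x, \lambda) \in H \times \mathbb{R} \mid \Phi(x) \leq \lambda \}. \]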

Recall that the set of all epigraphs on H is stable under outer and inner limits (see, e.g., [278, p.240]).

Definition 10.2 (Mosco epi-convergence). A parameterized family $(\Phi_\tau)_{\tau > 0}$ of extended real-valued functions defined on $H$ is said to be Mosco epi-convergent if $(\mathrm{Epi}(\Phi_\tau))_{\tau > 0}$ is Mosco convergent. In that case, we denote by $\text{ME-}\lim \Phi_\tau : H \to \overline{\mathbb{R}}$ the extended real-valued function defined on $H$ characterized by its epigraph as follows:

\[ \mathrm{Epi}(\text{ME-}\lim \Phi_\tau) := \text{M-}\lim \mathrm{Epi}(\Phi_\tau). \]

Recall the following characterization of Mosco epi-convergence (see, e.g., [211, Proposition 3.19]or [278, Proposition 7.2] for details).

Proposition 10.3. Let Φ be an extended real-valued function defined on H and let (Φτ)τ>0 be a pa-rameterized family of extended real-valued functions defined on H. Then the family (Φτ)τ>0 Moscoepi-converges with Φ = ME-limΦτ if and only if, for all x ∈ H, there exists (xτ)τ>0 → x such thatlimsupΦτ(xτ) ≤Φ(x) and, for all (xτ)τ>0 * x, liminfΦτ(xτ) ≥Φ(x).



10.2.2 Twice epi-differentiability

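For a given Φ ∈ Γ0(H), the second-order difference quotient functions are given in this chapter by

\[
\Delta_\tau^2 \Phi(x|y) : H \to \mathbb{R} \cup \{+\infty\}, \qquad
w \mapsto \Delta_\tau^2 \Phi(x|y)(w) := \frac{\Phi(x + \tau w) - \Phi(x) - \tau \langle y, w \rangle}{\tau^2},
\]

for all τ > 0, x ∈ dom(Φ) and y ∈ ∂Φ(x).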

Remark 10.4. Rockafellar originally defined in [274] the above second-order difference quotient functions with a factor 1/2 in the denominator. The main reason for including this factor is to make the second-order epi-derivatives agree with the classical second-order derivatives in the case where both exist. It turns out that this factor is of no use in the next Chapter 11 (see Section 11.3.3 for some details). Thus, in order to keep the notations homogeneous throughout the present manuscript, I decided to omit the factor 1/2 in both Chapters 10 and 11.

Definition 10.5 (Twice epi-differentiability). Let Φ ∈ Γ0(H). We say that Φ is twice epi-differentiable at x ∈ dom(Φ) for y ∈ ∂Φ(x) if the family $(\Delta_\tau^2 \Phi(x|y))_{\tau>0}$ Mosco epi-converges. In that case we denote by

\[ d_e^2 \Phi(x|y) := \text{ME-}\lim \Delta_\tau^2 \Phi(x|y), \]

which is called the second-order epi-derivative of Φ at x for y.

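Example 10.6. Consider H = R and let |·| : R → R stand for the standard absolute value map. It is clear that |·| ∈ Γ0(R) with

\[
\partial |\cdot| (x) =
\begin{cases}
\{-1\} & \text{if } x < 0, \\
[-1,1] & \text{if } x = 0, \\
\{1\} & \text{if } x > 0,
\end{cases}
\]

for all x ∈ R. One can easily see that |·| is twice epi-differentiable at any x ∈ R for all y ∈ ∂|·|(x) with $d_e^2 |\cdot| (x|y) = \iota_{K_{x,y}}$ where

\[
K_{x,y} :=
\begin{cases}
\mathbb{R} & \text{if } x \neq 0, \\
\mathbb{R}_- & \text{if } x = 0 \text{ and } y = -1, \\
\{0\} & \text{if } x = 0 \text{ and } y \in (-1,1), \\
\mathbb{R}_+ & \text{if } x = 0 \text{ and } y = 1,
\end{cases}
\]

is a nonempty closed convex subset of R. In particular we have $d_e^2 |\cdot| (x|y) \in \Gamma_0(\mathbb{R})$ with

\[ \mathrm{prox}_{d_e^2 |\cdot| (x|y)} = \mathrm{proj}_{K_{x,y}}, \]

for all x ∈ R and y ∈ ∂|·|(x).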
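As an informal sanity check of Example 10.6 (a sketch only, not a proof of Mosco epi-convergence), the second-order difference quotient of the absolute value can be evaluated for small values of τ and compared with the indicator function of K_{x,y}. The function names below are illustrative choices and do not come from [B29].

```python
import math

def second_order_quotient(x, y, w, tau):
    """Delta^2_tau |.|(x|y)(w) = (|x + tau*w| - |x| - tau*y*w) / tau**2."""
    return (abs(x + tau * w) - abs(x) - tau * y * w) / tau**2

def indicator_K(x, y, w):
    """Indicator of K_{x,y} from Example 10.6: 0 on K_{x,y}, +inf outside."""
    if x != 0:
        in_K = True        # K = R
    elif y == -1:
        in_K = w <= 0      # K = R_-
    elif y == 1:
        in_K = w >= 0      # K = R_+
    else:
        in_K = w == 0      # K = {0}, case y in (-1, 1)
    return 0.0 if in_K else math.inf

if __name__ == "__main__":
    # (x, y, w) triples with y in the subdifferential of |.| at x.
    samples = [(0.0, 1.0, 2.0), (0.0, 1.0, -1.0), (0.0, 0.5, 1.0), (1.0, 1.0, -3.0)]
    for x, y, w in samples:
        values = [second_order_quotient(x, y, w, tau) for tau in (1e-1, 1e-2, 1e-3)]
        print((x, y, w), values, "expected limit:", indicator_K(x, y, w))
```

Pointwise evaluation only suggests the limit: the quotient stays at (or near) zero when w ∈ K_{x,y} and grows like 1/τ otherwise.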

I conclude this section by recalling two propositions (see, e.g., [276, 278] for the finite-dimensional case and [229] for the infinite-dimensional one). I draw the reader's attention to the fact that Proposition 10.8 below was the key point allowing us to derive our main result [B29, Theorem 3.16] (which is recalled in Theorem 10.21 in the next section).

Proposition 10.7. Let Φ ∈ Γ0(H). If Φ is twice epi-differentiable at x ∈ dom(Φ) for y ∈ ∂Φ(x), then $d_e^2 \Phi(x|y) \in \Gamma_0(H)$.

Proposition 10.8. Let Φ ∈ Γ0(H) and F : R+ → H be a given function. We consider the function u : R+ → H defined by $u(t) := \mathrm{prox}_\Phi(F(t))$ for all t ≥ 0. If the conditions

(i) F is differentiable at t = 0;

(ii) Φ is twice epi-differentiable at u(0) for F(0) − u(0);

are both satisfied, then u is differentiable at t = 0 with

\[ u'(0) = \mathrm{prox}_{d_e^2 \Phi(u(0)|F(0)-u(0))}(F'(0)). \]
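To make Proposition 10.8 concrete, the following sketch checks it numerically in the scalar setting of Example 10.6, that is, H = R and Φ = |·|, for which the proximal operator is the soft-thresholding map with threshold 1 and the formula reads u′(0) = proj_{K_{u(0), F(0)−u(0)}}(F′(0)). The affine perturbation F(t) = F0 + t·dF0 and all function names are assumptions made for illustration.

```python
def prox_abs(z):
    """Proximal operator of |.| on R (soft-thresholding with threshold 1)."""
    if z > 1.0:
        return z - 1.0
    if z < -1.0:
        return z + 1.0
    return 0.0

def proj_K(x, y, w):
    """Projection of w onto the set K_{x,y} of Example 10.6."""
    if x != 0:
        return w               # K = R
    if y == -1:
        return min(w, 0.0)     # K = R_-
    if y == 1:
        return max(w, 0.0)     # K = R_+
    return 0.0                 # K = {0}

def predicted_derivative(F0, dF0):
    """Right-hand side of Proposition 10.8 for Phi = |.| and F(t) = F0 + t*dF0."""
    u0 = prox_abs(F0)
    return proj_K(u0, F0 - u0, dF0)

def finite_difference(F0, dF0, t=1e-6):
    """One-sided difference quotient (u(t) - u(0)) / t with u(t) = prox_abs(F(t))."""
    return (prox_abs(F0 + t * dF0) - prox_abs(F0)) / t

if __name__ == "__main__":
    for F0, dF0 in [(1.0, 2.0), (1.0, -2.0), (0.3, 5.0), (2.5, -3.0), (-1.0, -2.0)]:
        print(F0, dF0, predicted_derivative(F0, dF0), finite_difference(F0, dF0))
```

The cases F0 = ±1 are the interesting ones: there u is only one-sidedly differentiable in the data, and the derivative is a projection onto a half-line rather than a linear map.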



10.3 Main result

My aim in this section is to recall the main contribution of the paper [B29]. As a first step, I provide in Section 10.3.1 the functional setting and recall some basic results. Then, a general Signorini problem is presented and investigated in Section 10.3.2. A general Tresca friction problem is introduced and studied in Section 10.3.3. On this occasion, the subdifferential of the corresponding Tresca friction functional is characterized. This is a new result which was obtained in [B29, Lemma 3.13]. Finally, in Section 10.3.4, I recall in Theorem 10.21 the main result [B29, Theorem 3.16] which claims, roughly speaking, that the derivative of a parameterized Tresca friction problem involves Signorini unilateral conditions.

10.3.1 Functional setting and basic results

In the whole section we fix a positive integer d ∈ N∗ and a nonempty bounded connected open subset Ω ⊂ R^d with a Lipschitz continuous boundary Γ := ∂Ω. In what follows $C_c^\infty(\Omega)$ stands for the standard space of infinitely differentiable real functions defined and compactly supported on Ω, and D′(Ω) stands for the corresponding classical space of distributions. We will also use the standard Lebesgue and Sobolev spaces endowed with their usual norms.

Let us fix some function f ∈ L²(Ω). In what follows we will consider the classical Neumann problem given by

\[
\left\{
\begin{array}{rcll}
-\Delta F + F & = & f & \text{in } \Omega, \\
\partial_n F & = & 0 & \text{on } \Gamma.
\end{array}
\right.
\tag{NP}
\]

A solution to Problem (NP) is a function F ∈ H¹(Ω) which satisfies −∆F + F = f in D′(Ω) and such that ∂_n F ∈ L²(Γ) with ∂_n F(s) = 0 for almost every s ∈ Γ. Let us recall the very classical variational formulation and the well-posedness of Problem (NP) in the next propositions. We refer to the standard books [220, 233].

Proposition 10.9. A function F ∈ H¹(Ω) is a solution to Problem (NP) if and only if it satisfies the variational equality given by

\[ \int_\Omega \nabla F \cdot \nabla \varphi + \int_\Omega F \varphi = \int_\Omega f \varphi, \]

for all ϕ ∈ H¹(Ω).

Proposition 10.10. Problem (NP) admits a unique solution F ∈ H1(Ω).

10.3.2 A general Signorini problem

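In the whole section we consider four (possibly empty) measurable pairwise disjoint subsets Γ_N, Γ_D, Γ_{S−}, Γ_{S+} of Γ such that the decomposition

\[ \Gamma = \Gamma_N \cup \Gamma_D \cup \Gamma_{S-} \cup \Gamma_{S+} \]

holds true. The Signorini problem considered in the present chapter has the following form

\[
\left\{
\begin{array}{ll}
-\Delta u + u = f & \text{in } \Omega, \\
\partial_n u = 0 & \text{on } \Gamma_N, \\
u = 0 & \text{on } \Gamma_D, \\
u \le 0,\ \partial_n u \le 0 \text{ and } u\,\partial_n u = 0 & \text{on } \Gamma_{S-}, \\
u \ge 0,\ \partial_n u \ge 0 \text{ and } u\,\partial_n u = 0 & \text{on } \Gamma_{S+}.
\end{array}
\right.
\tag{SP}
\]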

Definition 10.11. Let u ∈ H1(Ω).

(i) The function u is said to be a (strong) solution to Problem (SP) if it satisfies −∆u + u = f in D′(Ω) and ∂_n u ∈ L²(Γ) with the four boundary conditions being satisfied almost everywhere on Γ.

(ii) The function u is said to be a weak solution to Problem (SP) if u ∈ K and u satisfies the variational inequality given by

\[ \int_\Omega \nabla u \cdot \nabla (\varphi - u) + \int_\Omega u (\varphi - u) \ge \int_\Omega f (\varphi - u), \]

for all ϕ ∈ K, where K is the nonempty closed convex subset of H¹(Ω) given by

\[ K := \{ \varphi \in H^1(\Omega) \mid \varphi \le 0 \text{ on } \Gamma_{S-},\ \varphi = 0 \text{ on } \Gamma_D \text{ and } \varphi \ge 0 \text{ on } \Gamma_{S+} \}. \]

Remark 10.12. In Definition 10.11, two different concepts of solutions to Problem (SP) are introduced. In fact, using standard techniques of variational formulations, we proved in [B29, Proposition 3.8] that a strong solution to Problem (SP) is necessarily a weak solution. Conversely, we were able to prove that a weak solution to Problem (SP) is also a solution in the strong sense only under an additional condition on the regularity of the weak solution and on the decomposition of Γ. I refer to [B29, Proposition 3.8] for details. Finally, based on the well-known characterization of the projection operator proj_K onto K, one can easily obtain the existence and uniqueness result provided in Proposition 10.13 below.

Proposition 10.13. Problem (SP) admits a unique weak solution given by

u = projK (F ),

where F ∈ H1(Ω) is the unique solution to Problem (NP).

10.3.3 A general Tresca friction problem and the Tresca friction functional

The Tresca friction problem considered in this chapter has the form

\[
\left\{
\begin{array}{ll}
-\Delta u + u = f & \text{in } \Omega, \\
|\partial_n u| \le 1 \text{ and } |u|\,\partial_n u + u = 0 & \text{on } \Gamma.
\end{array}
\right.
\tag{TP}
\]

A solution to Problem (TP) is a function u ∈ H¹(Ω) which satisfies −∆u + u = f in D′(Ω) and such that ∂_n u ∈ L²(Γ) with |∂_n u(s)| ≤ 1 and |u(s)| ∂_n u(s) + u(s) = 0 for almost every s ∈ Γ.

Remark 10.14. Usually, in the Tresca friction law, a general friction threshold g ∈ L²(Γ) with g ≥ 0 is used (replacing the boundary conditions by |∂_n u| ≤ g and |u| ∂_n u + u g = 0 on Γ). In our paper [B29], without loss of generality and for simplicity, we decided to take g ≡ 1.

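Proposition 10.15. A function u ∈ H¹(Ω) is a solution to Problem (TP) if and only if it satisfies the variational inequality given by

\[ \int_\Omega \nabla u \cdot \nabla (\varphi - u) + \int_\Omega u (\varphi - u) + \int_\Gamma |\varphi| - \int_\Gamma |u| \ge \int_\Omega f (\varphi - u), \]

for all ϕ ∈ H¹(Ω).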

Proposition 10.16. Problem (TP) admits a unique solution given by

\[ u = \mathrm{prox}_\Phi(F), \]

where F ∈ H¹(Ω) is the unique solution to Problem (NP) and Φ ∈ Γ0(H¹(Ω)) is the Tresca friction functional defined by

\[ \Phi : H^1(\Omega) \to \mathbb{R}, \qquad \varphi \mapsto \Phi(\varphi) := \int_\Gamma |\varphi|. \]

Remark 10.17. Propositions 10.15 and 10.16 can be found in [B29, Propositions 3.11 and 3.12]. The proofs are based on standard techniques of variational formulations for the first proposition, and on the well-known characterization of the proximal operator prox_Φ of Φ for the second one.
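For the reader's convenience, the passage from Proposition 10.15 to Proposition 10.16 can be sketched as follows. This is only an outline under the assumptions that ⟨·,·⟩_{H¹(Ω)} denotes the usual scalar product of H¹(Ω) and that the proximal operator is understood as prox_Φ = (I + ∂Φ)⁻¹; the detailed proof is in [B29]. The first equivalence uses Proposition 10.9 with the test function ϕ − u.

```latex
\begin{align*}
u \text{ solves (TP)}
&\iff \forall \varphi \in H^1(\Omega), \quad
   \langle u, \varphi - u \rangle_{H^1(\Omega)} + \Phi(\varphi) - \Phi(u)
   \ge \int_\Omega f (\varphi - u)
   = \langle F, \varphi - u \rangle_{H^1(\Omega)} \\
&\iff \forall \varphi \in H^1(\Omega), \quad
   \Phi(\varphi) \ge \Phi(u) + \langle F - u, \varphi - u \rangle_{H^1(\Omega)} \\
&\iff F - u \in \partial \Phi(u) \\
&\iff u = (I + \partial \Phi)^{-1}(F) = \mathrm{prox}_\Phi(F).
\end{align*}
```

The same computation with Φ replaced by the indicator function of K gives the formula u = proj_K(F) of Proposition 10.13.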



In order to apply Proposition 10.8 to the above context, our objective in [B29] was to investigate the twice epi-differentiability of the Tresca friction functional Φ. To this aim we were led in [B29, Section 3.3] to characterize the subdifferential operator of Φ and to give an expression of the second-order difference quotient functions associated to Φ. Before recalling these two results (which can be found in [B29, Lemma 3.13 and Proposition 3.14]), we first need to introduce the Auxiliary Problem

\[
\left\{
\begin{array}{ll}
-\Delta v + v = 0 & \text{in } \Omega, \\
\partial_n v(s) \in \partial |\cdot| (u(s)) & \text{on } \Gamma,
\end{array}
\right.
\tag{AP$_u$}
\]

for all u ∈ H¹(Ω). A solution to Problem (AP_u) for some u ∈ H¹(Ω) is a function v ∈ H¹(Ω) which satisfies −∆v + v = 0 in D′(Ω) and such that ∂_n v ∈ L²(Γ) with ∂_n v(s) ∈ ∂|·|(u(s)) for almost all s ∈ Γ.

Lemma 10.18. It holds that

∂Φ(u) = the set of solutions to Problem (APu),

for all u ∈ H1(Ω).

Proposition 10.19. The second-order difference quotient functions associated to the Tresca friction functional Φ satisfy

\[ \Delta_\tau^2 \Phi(u|v)(w) = \int_\Gamma \Delta_\tau^2 |\cdot| (u(s)|\partial_n v(s))(w(s)) \, ds, \]

for all τ > 0, u ∈ H¹(Ω), v ∈ ∂Φ(u) and w ∈ H¹(Ω).

Remark 10.20. If Φ is twice epi-differentiable at u ∈ H¹(Ω) for some v ∈ ∂Φ(u), one can naturally expect from Proposition 10.19 that its second-order epi-derivative satisfies

\[ d_e^2 \Phi(u|v)(w) = \int_\Gamma d_e^2 |\cdot| (u(s)|\partial_n v(s))(w(s)) \, ds, \]

for all w ∈ H¹(Ω). The above formula, which in view of Proposition 10.19 amounts to interchanging the ME-lim symbol and the ∫_Γ symbol, remains an open and challenging question which is postponed to future research works. Nevertheless I refer to [B29, Remark 3.18 and Appendix B] in which the proof of the above formula is provided in some particular cases where u is a solution to the Tresca friction problem and v = F − u ∈ ∂Φ(u). In the treated cases, the proof is based on the characterization of Mosco epi-convergence recalled in Proposition 10.3, on arguments from Lebesgue integration theory and also on some regularity assumptions (such as the continuity of u over the boundary Γ). I emphasize that the proof detailed in [B29, Appendix B] is technical and requires the use of a nontrivial dilation technique. As a conclusion of this remark, I mention the work [261] in which, in the same spirit, the author studied the interchange of the ME-lim symbol and the ∫_Ω symbol over the L²(Ω)-space. Note that this previous work cannot be applied to our context since we consider here the H¹(Ω)-space and the ∫_Γ symbol over the boundary Γ (instead of the ∫_Ω symbol over the set Ω in [261]).

10.3.4 The derivative of a parameterized Tresca friction problem

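The parameterized Tresca friction problem considered in this chapter is given by

\[
\left\{
\begin{array}{ll}
-\Delta u_t + u_t = f_t & \text{in } \Omega, \\
|\partial_n u_t| \le 1 \text{ and } |u_t|\,\partial_n u_t + u_t = 0 & \text{on } \Gamma,
\end{array}
\right.
\tag{TP$_t$}
\]

where f_t ∈ L²(Ω) for all t ≥ 0. From Proposition 10.16, Problem (TP_t) has a unique solution u_t ∈ H¹(Ω) given by

\[ u_t = \mathrm{prox}_\Phi(F_t), \]

for all t ≥ 0, where F_t ∈ H¹(Ω) is the unique solution to the parameterized Neumann problem

\[
\left\{
\begin{array}{rcll}
-\Delta F_t + F_t & = & f_t & \text{in } \Omega, \\
\partial_n F_t & = & 0 & \text{on } \Gamma.
\end{array}
\right.
\tag{NP$_t$}
\]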



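Theorem 10.21. Assume that the following assumptions are both satisfied:

(A1) the map t ≥ 0 ↦ f_t ∈ L²(Ω) is differentiable at t = 0, with derivative denoted by f′_0 ∈ L²(Ω);

(A2) Φ is twice epi-differentiable at u_0 for F_0 − u_0 with

\[ d_e^2 \Phi(u_0|F_0 - u_0)(w) = \int_\Gamma d_e^2 |\cdot| (u_0(s)|\partial_n (F_0 - u_0)(s))(w(s)) \, ds, \]

for all w ∈ H¹(Ω).

Then the map t ≥ 0 ↦ u_t ∈ H¹(Ω) is differentiable at t = 0 and its derivative, denoted by u′_0 ∈ H¹(Ω), is the unique weak solution to the Signorini problem given by

\[
\left\{
\begin{array}{ll}
-\Delta u_0' + u_0' = f_0' & \text{in } \Omega, \\
\partial_n u_0' = 0 & \text{on } \Gamma_N^{u_0}, \\
u_0' = 0 & \text{on } \Gamma_D^{u_0}, \\
u_0' \le 0,\ \partial_n u_0' \le 0 \text{ and } u_0'\,\partial_n u_0' = 0 & \text{on } \Gamma_{S-}^{u_0}, \\
u_0' \ge 0,\ \partial_n u_0' \ge 0 \text{ and } u_0'\,\partial_n u_0' = 0 & \text{on } \Gamma_{S+}^{u_0},
\end{array}
\right.
\tag{SP$_0'$}
\]

where

\[
\begin{aligned}
\Gamma_N^{u_0} &:= \{ s \in \Gamma \mid u_0(s) \neq 0 \}, &
\Gamma_D^{u_0} &:= \{ s \in \Gamma \mid u_0(s) = 0 \text{ and } \partial_n u_0(s) \in (-1,1) \}, \\
\Gamma_{S-}^{u_0} &:= \{ s \in \Gamma \mid u_0(s) = 0 \text{ and } \partial_n u_0(s) = 1 \}, &
\Gamma_{S+}^{u_0} &:= \{ s \in \Gamma \mid u_0(s) = 0 \text{ and } \partial_n u_0(s) = -1 \}.
\end{aligned}
\]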

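Remark 10.22. Theorem 10.21 can be found in [B29, Theorem 3.16]. Its proof is based on the following arguments. From Example 10.6, it holds that

\[ d_e^2 \Phi(u_0|F_0 - u_0)(w) = \int_\Gamma \iota_{K_{u_0(s),\partial_n (F_0 - u_0)(s)}}(w(s)) \, ds = \iota_{K_0}(w), \]

where, since ∂_n F_0 = 0 on Γ, we have

\[
\begin{aligned}
K_0 &:= \{ \varphi \in H^1(\Omega) \mid \varphi(s) \in K_{u_0(s),\partial_n (F_0 - u_0)(s)} \text{ for a.e. } s \in \Gamma \} \\
&= \{ \varphi \in H^1(\Omega) \mid \varphi \le 0 \text{ on } \Gamma_{S-}^{u_0},\ \varphi = 0 \text{ on } \Gamma_D^{u_0} \text{ and } \varphi \ge 0 \text{ on } \Gamma_{S+}^{u_0} \}.
\end{aligned}
\]

Finally the application of Propositions 10.8 and 10.13 concludes the proof.

Remark 10.23. Assumption (A2) in Theorem 10.21 is strong. I refer to Remark 10.20 for a discussion on that technical point.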

10.4 Illustration with some numerical simulations

My aim in this section is to illustrate Theorem 10.21 with some numerical simulations extracted from [B29, Section 4] and performed using FreeFem++ [239]. In order to approximate the Signorini problem, we used the iterative switching algorithm introduced by Aitchison and Poole in [206]. A revisited iterative switching algorithm was used for the numerical resolution of the parameterized Tresca friction problem. I refer to [B29, Appendix C] for more details. In full agreement with Theorem 10.21, these simulations underline that, for a given small t > 0, the unique solution u_t to the parameterized Tresca friction problem (TP_t) can be approximated by u_0 + t u′_0, where u′_0 is the unique weak solution to the Signorini problem (SP′_0).



Two-dimensional simulations. Let d = 2 and let Ω be the open unit disk of R². Then, for all t ≥ 0, we consider the function f_t ∈ L²(Ω) defined by f_t(x, y) := e^t f(x, y) where

\[ f(x,y) := \frac{1}{2}(x^2 + y^2 - 5)\,h(x) - 2x\,h'(x) - \frac{1}{2}(x^2 + y^2 - 1)\,h''(x), \]

for almost all (x, y) ∈ Ω, where

\[
h(x) :=
\begin{cases}
-1 & \text{if } -1 < x \le -\frac{1}{2}, \\
\sin(\pi x) & \text{if } -\frac{1}{2} \le x \le \frac{1}{2}, \\
1 & \text{if } \frac{1}{2} \le x < 1,
\end{cases}
\]

for all x ∈ (−1, 1). Our choice of such a function f = f_0 is justified by the fact that we are able, in this case, to express analytically the exact solution u_0 to Problem (TP_0), which is given by

\[ u_0(x,y) := \frac{1}{2}(x^2 + y^2 - 1)\,h(x), \]

for all (x, y) ∈ Ω. On the other hand the choice of the expression of the function h is justified by the fact that it provides an example in which the decomposition $\Gamma = \Gamma_N^{u_0} \cup \Gamma_D^{u_0} \cup \Gamma_{S-}^{u_0} \cup \Gamma_{S+}^{u_0}$ is not trivial in the sense that $\Gamma_{S-}^{u_0} \cup \Gamma_{S+}^{u_0}$ has a positive measure, which guarantees in the Signorini problem (SP′_0) the presence of parts of the boundary with Signorini-type conditions. Indeed, since u_0 = 0 and ∂_n u_0 = h(x) on Γ, one can easily deduce from the expression of u_0 that

\[
\Gamma_{S+}^{u_0} = \left\{ (x,y) \in \Gamma \mid x \le -\tfrac{1}{2} \right\}, \quad
\Gamma_D^{u_0} = \left\{ (x,y) \in \Gamma \mid -\tfrac{1}{2} < x < \tfrac{1}{2} \right\}
\quad \text{and} \quad
\Gamma_{S-}^{u_0} = \left\{ (x,y) \in \Gamma \mid x \ge \tfrac{1}{2} \right\}.
\]
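The claim that u_0 is an exact solution can be checked numerically. The sketch below is an illustration that is not taken from [B29]: it approximates the Laplacian of u_0 by central finite differences at a few interior points (chosen away from x = ±1/2, where h″ jumps) and approximates the normal derivative on the unit circle by a one-sided difference quotient. It relies only on the formulas for h, f and u_0 given above; the step sizes and sample points are arbitrary choices.

```python
import math

def h(x):
    if x <= -0.5:
        return -1.0
    if x >= 0.5:
        return 1.0
    return math.sin(math.pi * x)

def dh(x):
    return math.pi * math.cos(math.pi * x) if -0.5 < x < 0.5 else 0.0

def d2h(x):
    return -math.pi**2 * math.sin(math.pi * x) if -0.5 < x < 0.5 else 0.0

def u0(x, y):
    return 0.5 * (x**2 + y**2 - 1.0) * h(x)

def f(x, y):
    r2 = x**2 + y**2
    return 0.5 * (r2 - 5.0) * h(x) - 2.0 * x * dh(x) - 0.5 * (r2 - 1.0) * d2h(x)

def pde_residual(x, y, eps=1e-3):
    """-Laplace(u0) + u0 - f at (x, y), with a 5-point finite-difference Laplacian."""
    lap = (u0(x + eps, y) + u0(x - eps, y) + u0(x, y + eps) + u0(x, y - eps)
           - 4.0 * u0(x, y)) / eps**2
    return -lap + u0(x, y) - f(x, y)

def normal_derivative(theta, eps=1e-6):
    """One-sided difference quotient of u0 along the outward normal at (cos theta, sin theta)."""
    x, y = math.cos(theta), math.sin(theta)
    return (u0(x, y) - u0((1.0 - eps) * x, (1.0 - eps) * y)) / eps

if __name__ == "__main__":
    for point in [(0.2, 0.3), (-0.7, 0.1), (0.6, -0.4)]:
        print("PDE residual at", point, "=", pde_residual(*point))
    for theta in [0.0, 1.2, math.pi / 2, 2.5, math.pi]:
        print("theta =", theta, "normal derivative ~", normal_derivative(theta),
              "h(cos theta) =", h(math.cos(theta)))
```

The second loop illustrates the identity ∂_n u_0 = h(x) on Γ, from which the three boundary parts are read off: ∂_n u_0 = −1 for x ≤ −1/2, ∂_n u_0 ∈ (−1, 1) for |x| < 1/2 and ∂_n u_0 = 1 for x ≥ 1/2.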

In order to illustrate Theorem 10.21, we first compute numerically the solutions u_0 and u′_0. Then, for several small values t > 0, we compute numerically the solution u_t and compare it with u_0 + t u′_0 (using the H¹-norm). We gather our results in Table 10.1, and Figure 10.1 represents the H¹-comparison with respect to t in logarithmic scale. Figure 10.2 concludes this paragraph with the illustration of the case t = 0.01.

t          ‖u_t − u_0 − t u′_0‖_{H¹(Ω)}
0.40       0.6528
0.20       0.2360
0.1        0.0909
0.075      0.0616
0.05       0.0360
0.025      0.0138
0.01       0.0040
0.0075     0.0029
0.005      0.0021
0.0025     0.0016

Table 10.1 – H1-norm of the difference between ut and its first-order approximation u0 + tu′0.
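Differentiability of t ↦ u_t at t = 0 means that the error in Table 10.1 should be negligible compared with t as t → 0+. A quick way to inspect this from the tabulated values is to compute the ratios error/t and the local slopes in log-log scale, as in the following sketch (the data are copied from Table 10.1; the analysis itself is an illustration and not part of [B29]).

```python
import math

# Data copied from Table 10.1.
t_values = [0.40, 0.20, 0.1, 0.075, 0.05, 0.025, 0.01, 0.0075, 0.005, 0.0025]
errors = [0.6528, 0.2360, 0.0909, 0.0616, 0.0360, 0.0138, 0.0040, 0.0029, 0.0021, 0.0016]

# Ratio error / t: it should become small if the error is o(t).
ratios = [e / t for t, e in zip(t_values, errors)]

# Local slope between consecutive points in log-log scale.
slopes = [
    math.log(errors[i] / errors[i + 1]) / math.log(t_values[i] / t_values[i + 1])
    for i in range(len(errors) - 1)
]

if __name__ == "__main__":
    for t, e, r in zip(t_values, errors, ratios):
        print(f"t = {t:<7} error = {e:<7} error / t = {r:.3f}")
    print("local log-log slopes:", [round(s, 2) for s in slopes])
```

The ratio decreases from about 1.63 at t = 0.40 to about 0.39 at t = 0.0075 and then increases slightly for the two smallest values of t. The source does not comment on this; one possible explanation, stated here only as an assumption, is that the discretization error of the numerical solves becomes dominant for very small t.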

Figure 10.1 – The H¹-comparison ‖u_t − u_0 − t u′_0‖_{H¹(Ω)} with respect to t in logarithmic scale.

Three-dimensional simulations. Let d = 3 and let Ω be the cube (0, 1)³. We use here the parameterized function f_t ∈ L²(Ω) (chosen arbitrarily) defined by

\[ f_t(x,y,z) := \sin(t)\, x\, e^{y} e^{2z} + \sqrt{x}\, \cos(x y^2)\, e^{z} + 25 (z - 1), \]

for all (x, y, z) ∈ Ω and all t ≥ 0. Figure 10.3 illustrates the solution u_t for t = 0.1 and its first-order approximation u_0 + t u′_0. Here we obtain ‖u_t − u_0 − t u′_0‖_{H¹(Ω)} = 0.000575736.



[Figure 10.2, left panel: "The solution of the Tresca problem with f(0.01)", isovalue legend from −0.449283 to 0.449582. Right panel: "Approximation of the solution of the Tresca problem with f(0.01)", isovalue legend from −0.449168 to 0.44946.]

Figure 10.2 – Case t = 0.01 (d = 2): solution ut (left) and its first-order approximation u0 + tu′0 (right).

Figure 10.3 – Case t = 0.1 (d = 3): solution ut (left) and its first-order approximation u0 + tu′0 (right).

10.5 Concluding remarks and perspectives

It is worth noticing that we focused in [B29] on a scalar version of the Tresca friction problem, but we are confident that our methodology can be extended in the same manner to the linear elasticity case. On the other hand it is well known that the Tresca friction law is an approximation of the more realistic Coulomb law. This may open possibilities for further extensions to quasi-variational inequalities and to time-dependent processes in contact mechanics. This could be the subject of forthcoming research projects.

In our opinion no difficulty would arise in extending our approach in [B29] to a general friction threshold g ∈ L²(Γ) with g ≥ 0 (see Remark 10.14). Note that the perturbation in Problem (TP_t) concerns only the right-hand source term f_t ∈ L²(Ω). An interesting extension, in particular for the needs of the research project presented in the prelude Chapter 8, is to consider a perturbed friction threshold g_t ∈ L²(Γ) depending on the parameter t ≥ 0. In that case, the Tresca friction functional Φ_t would also depend on the parameter t ≥ 0 and thus the definition of twice epi-differentiability recalled in Section 10.2 should be adjusted. This is exactly the content of the paper [B26] in collaboration with Adly which is summarized in the next Chapter 11.

In our paper [B29], we have proved, roughly speaking, that the derivative of a parameterized Tresca friction problem (which corresponds to a variational inequality of the second kind involving a positively homogeneous convex functional Φ of degree one) is a Signorini problem (which corresponds to a variational inequality of the first kind and can be seen as a complementarity problem). As a conclusion I mention that a work in progress, still in collaboration with Adly, aims to prove the following more general result: the derivative of a variational inequality of the second kind involving a positively homogeneous convex functional of degree one, under the additional assumption that its subdifferential operator at the origin is polyhedric in the Haraux sense [237] (note that we have not yet been able to prove or refute this condition in the case of the Tresca friction functional Φ), is a complementarity problem involving the tangent cone of the normal cone to the subdifferential operator of the convex functional at the origin.


Chapter 11

Sensitivity analysis of variational inequalities via twice epi-differentiability and proto-differentiability of the proximal operator


11.2 Objective of the paper [B26] and preliminaries
    11.2.1 Additional reminders on convergence notions and convex analysis
    11.2.2 Main objective and extensions of generalized differentiability notions
11.3 Main contributions
    11.3.1 Convergent supporting hyperplane
    11.3.2 Main result
    11.3.3 Comments on the choice of Formula (11.1)
11.4 Applications and perspectives
    11.4.1 Applications to parameterized convex minimization problems
    11.4.2 A work in progress


• [B26] S. Adly and L. Bourdin. Sensitivity analysis of variational inequalities via twice epi-differentiability and proto-differentiability of the proximity operator. SIAM J. Optim., 28(2):1699–1725, 2018.


11.1 Introduction

Let H be a real Hilbert space and let ⟨·, ·⟩ (resp. ‖·‖) be the corresponding scalar product (resp. norm). Throughout this chapter we will use the notations already introduced in the previous Chapters 9 and 10.

Bibliographical context. One of the most famous optimization problems consists in finding a point y of a nonempty set K ⊂ H that minimizes the distance to a given point x ∈ H. If K is a nonempty closed convex set, it is well known that the above problem admits a unique solution, denoted by y = proj_K(x), called the projection of x onto K, and characterized by the following variational inequality

\[ \forall z \in K, \quad \langle y, z - y \rangle \ge \langle x, z - y \rangle. \]

As a first step towards the sensitivity analysis of the above variational inequality, one would consider a slight perturbation of the point x ∈ H, replacing it by x(t) ∈ H where t ≥ 0 is a parameter. The behavior of y(t) = proj_K(x(t)) has been well studied in the literature and is naturally connected to the differentiability properties (in some general sense) of the projection operator proj_K. We mention for instance the work of Zarantonello [295] who showed the existence of directional derivatives for proj_K on the boundary of K. However it is well known that, outside the set K, the directional differentiability of proj_K is not guaranteed (see [283] for a two-dimensional counter-example). On the other hand, Haraux [237] and Mignot [255] published two fundamental papers where the conical differentiability of the projection operator is proved under a polyhedric assumption. In that case, an asymptotic expansion of y(t) at t = 0 can be obtained and the derivative y′(0) can be expressed as the projection of x′(0) onto a closed subset of the tangent cone of K at y(0). I refer to [236, 241, 284] for other references concerning the differentiability of the projection operator.

The theory of variational inequalities, initiated by Signorini [286, 287] and Fichera [235] with motivations in contact mechanics, was developed in the 1970s by the French and Italian schools, with Lions and Stampacchia [249, 250, 291, 292], Brézis [217, 218] and Mosco [260], among others. In this introduction let us focus on a general nonlinear variational inequality of the second kind which consists in finding y ∈ H such that

\[ \forall z \in H, \quad \langle A(y), z - y \rangle + f(z) - f(y) \ge \langle x, z - y \rangle, \tag{VI$(A,f,x)$} \]

where A : H → H is a (possibly nonlinear) operator, f ∈ Γ0(H) and x ∈ H is given. In the case where A = I is the identity operator, (VI(A, f, x)) admits a unique solution given by y = prox_f(x). Otherwise, in the general case of a possibly nonlinear operator A, assumed to be Lipschitz continuous and strongly monotone, it can be proved (see [217, Proposition 31]) that (VI(A, f, x)) admits a unique solution denoted by y = prox_{A,f}(x), where prox_{A,f} := (A + ∂f)⁻¹ can be seen as a generalization of the proximal operator prox_f.

A first step towards the sensitivity analysis of (VI(A, f, x)) is to consider a slight perturbation of the point x ∈ H, replacing it by x(t) ∈ H where t ≥ 0 is a parameter, and to study the behavior of y(t) = prox_{A,f}(x(t)). The key point in the understanding of the differentiability properties of the generalized proximal operator prox_{A,f} is to investigate generalized second-order differentiation theory. Precisely, Rockafellar introduced and thoroughly studied the notions of twice epi-differentiability [274] and proto-differentiability [275] of functions and set-valued maps respectively. These notions are both based on the Painlevé–Kuratowski convergence of epigraphs and graphs of difference quotients. In the particular case where A = I is the identity operator and H is finite-dimensional, thanks to the Attouch theorem [211], Rockafellar proved in [276] the equivalence between the twice epi-differentiability of f, the proto-differentiability of the subdifferential operator ∂f and the proto-differentiability of the proximal operator prox_f (see also [278, Chapter 13]). Note that Do [229] extended all these results to the infinite-dimensional Hilbert setting using the concept of Mosco convergence [259]. Finally, from the twice epi-differentiability of f, it can be derived that y(t) = prox_f(x(t)) is differentiable at t = 0 (see Proposition 10.8). In the general case of a possibly nonlinear operator A, the additional assumption of semi-differentiability of A (a notion introduced by Penot [263]) allows one to prove the differentiability of y(t) = prox_{A,f}(x(t)) at t = 0. Moreover y′(0) can be expressed as the image of x′(0) under a generalized proximal operator involving the semi-derivative of A and the second-order epi-derivative of f.

For the calculus rules of epi-derivatives, we refer to the works of Poliquin and Rockafellar [268, 269] and references therein. In particular it can be proved that the polyhedric assumption of Haraux and Mignot on a nonempty closed convex set K is a sufficient condition for the twice epi-differentiability of the corresponding indicator function ι_K. We refer to the discussion in [229, Definition 2.8 and Example 2.10]. As a consequence, Rockafellar's result encompasses the one by Haraux and Mignot.

Contributions of the paper [B26]. The paper [B26], written jointly with Adly, was motivated by a larger research project dealing with shape optimization problems involved in contact mechanics (see details in Chapter 8). The precise challenge of [B26] was to investigate the sensitivity analysis of (VI(A, f, x)) when a perturbation acts on all the data of the problem, not only on x but also on A and f. Precisely we considered the following perturbed variational inequality which consists in finding y(t) ∈ H such that

\[ \forall z \in H, \quad \langle A(t,y), z - y \rangle + f(t,z) - f(t,y) \ge \langle x(t), z - y \rangle, \tag{VI$(A(t,\cdot),f(t,\cdot),x(t))$} \]

where A(t, ·), f(t, ·) and x(t) satisfy some appropriate assumptions. Note that the solution to (VI(A(t, ·), f(t, ·), x(t))) is given by

\[ y(t) = \mathrm{prox}_{A(t,\cdot),f(t,\cdot)}(x(t)). \]

Our main objective in [B26] was to provide sufficient conditions on A(t, ·), f(t, ·) and x(t) under which y(t) is differentiable at t = 0 and to provide an explicit formula for y′(0). As mentioned in the previous paragraph, Rockafellar already dealt with the t-independent framework, that is, with the particular case where A(t, ·) = A and f(t, ·) = f are t-independent. The concepts introduced by Rockafellar cannot be directly applied to the t-dependent framework. Therefore we extended in [B26, Section 3] (recalled in Section 11.2) the notions of twice epi-differentiability, semi-differentiability and proto-differentiability to the case of parameterized functions, single- and set-valued maps respectively.

Note that the extension of the notion of twice epi-differentiability to the t-dependent framework is not a trivial replica: specific features emerge in that context which require adjustments. I refer to the beginning of Section 11.3 and to Section 11.3.3 for details. In particular a new concept called convergent supporting hyperplane was introduced in [B26, Definition 4.5] (recalled in Definition 11.18), which is an efficient tool for proving the equivalence between the twice epi-differentiability of f(t, ·) and the proto-differentiability of its subdifferential operator ∂f(t, ·) := ∂(f(t, ·)) (see Theorem 11.19). Actually, in that context, the existence of a convergent supporting hyperplane to the second-order difference quotient functions associated to f(t, ·) was shown to be equivalent to the membership of its second-order epi-derivative in Γ0(H) (see Remark 11.23).

Finally, using all the above-mentioned new notions, we were able in [B26, Theorem 4.15] (recalled in Theorem 11.24) to prove that y(t) = prox_{A(t,·),f(t,·)}(x(t)) is differentiable at t = 0. Moreover we obtained that y′(0) can be expressed as the image of x′(0) under a generalized proximal operator involving the semi-derivative of A(t, ·) and the second-order epi-derivative of f(t, ·). In particular y′(0) is thus the unique solution to the associated variational inequality (see Remark 11.25).

Our main result [B26, Theorem 4.15] encompasses and extends Rockafellar's work, and thus the results of Haraux and Mignot mentioned in the previous paragraph. I conclude this paragraph by mentioning that a very recent work [204] by Adly and Rockafellar extends the results obtained in [B26] to the case where the generalized proximal operator prox_{A(t,·),f(t,·)} is replaced by a generalized resolvent operator J_{A(t,·),B(t,·)} := (A(t, ·) + B(t, ·))⁻¹ associated to two parameterized single- and set-valued maps A(t, ·) and B(t, ·) satisfying some appropriate and related assumptions.



Applications and perspectives. Section 11.4 is dedicated to some applications and perspectives. Indeed we investigated in [B26, Section 5] the sensitivity analysis of parameterized convex optimization problems. The results obtained therein are summarized in Section 11.4.1 of the present chapter. Additionally we conclude Section 11.4 by returning to the open question raised in Section 10.5 of the previous chapter. Precisely we investigate the sensitivity analysis of a parameterized scalar Tresca friction problem which involves a perturbed friction threshold. According to first computations made in collaboration with Caubet, the results obtained in [B26], and summarized in the present chapter, are suitable for carrying out such a sensitivity analysis. I refer to Section 11.4.2 for details.

11.2 Objective of the paper [B26] and preliminaries

For the needs of the present chapter, some notations and reminders are provided in Section 11.2.1. The main goal of the paper [B26] is presented in the next Section 11.2.2.

11.2.1 Additional reminders on convergence notions and convex analysis

In what follows we preserve the notations introduced in the previous Chapters 9 and 10. For the needs of the present chapter, additional reminders on some convergence notions and convex analysis are provided. I refer the reader to standard books such as [211, 242, 264, 278] and references therein.

Definition 11.1 (Painlevé–Kuratowski convergence). A parameterized family (Sτ)τ>0 of subsets of H is said to be Painlevé–Kuratowski convergent if

\[ \limsup S_\tau \subset \liminf S_\tau. \]

In that case we write PK-lim Sτ := liminf Sτ = limsup Sτ.

Remark 11.2. Note that the Painlevé–Kuratowski convergence is a weaker notion than the Mosco convergence (recalled in Definition 10.1). However they coincide when H is finite-dimensional.

Definition 11.3 (Graphical convergence). A parameterized family (Aτ)τ>0 of set-valued maps on H is said to be graphically convergent if (Gr(Aτ))τ>0 is Painlevé–Kuratowski convergent. In that case we denote by G-lim Aτ : H ⇒ H the set-valued map characterized by its graph as follows:

\[ \mathrm{Gr}(\text{G-}\lim A_\tau) := \text{PK-}\lim \mathrm{Gr}(A_\tau). \]

In the next theorems, I recall two results proved by Attouch in [211, Theorem 3.66 and Corollary 3.65] which play a crucial role in this chapter.

Theorem 11.4. Let (fτ)τ>0 be a parameterized family of functions in Γ0(H) and let f ∈ Γ0(H). Then, f = ME-lim fτ if and only if the following assertions are both satisfied:

(i) ∂f = G-lim ∂fτ;

(ii) there exists (zτ, ξτ)τ>0 → (z, ξ) such that (zτ, ξτ) ∈ Gr(∂fτ) for all τ > 0 and fτ(zτ) → f(z).

Theorem 11.5. Let (fτ)τ>0 be a parameterized family of functions in Γ0(H). If (∂fτ)τ>0 is graphically convergent with a maximal monotone limit G-lim ∂fτ, then there exists a function f ∈ Γ0(H) such that G-lim ∂fτ = ∂f.

For a single-valued map A : H → H, we say that A is Lipschitz continuous if

∃M ≥ 0, ∀x1, x2 ∈ H, ‖A(x2)− A(x1)‖ ≤ M‖x2 −x1‖,

138


and we say that A is strongly monotone if

∃α > 0, ∀x1, x2 ∈ H, ⟨A(x2) − A(x1), x2 − x1⟩ ≥ α‖x2 − x1‖².

In the whole chapter we denote by A (H) the set of all single-valued maps A : H → H that are Lipschitz continuous and strongly monotone. The generalized proximal operator associated to a pair (A, f) ∈ A (H)×Γ0(H) is defined by

prox_{A,f} := (A + ∂f)⁻¹.

From the contraction mapping principle, it can be proved that prox_{A,f} : H → H is a single-valued map (see, e.g., [217, Proposition 31]) which is Lipschitz continuous.
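The contraction argument can be illustrated numerically on H = R. The following is only a minimal Python sketch under stated assumptions: the map A(y) = 2y + sin(y), the function f = λ|·|, the step size γ = α/M² and the forward-backward fixed-point iteration are illustrative choices that do not come from [217] or [B26]. The sketch checks that the iteration reaches a point y satisfying x − A(y) ∈ ∂f(y).

```python
import math

def soft_threshold(u, kappa):
    """Classical proximal operator of kappa*|.| on the real line."""
    return math.copysign(max(abs(u) - kappa, 0.0), u)

def A(y):
    """Illustrative map (assumption): strongly monotone with alpha = 1,
    Lipschitz continuous with M = 3."""
    return 2.0 * y + math.sin(y)

def generalized_prox(A, x, lam, alpha, M, iters=500):
    """Approximate prox_{A, lam|.|}(x) = (A + lam*d|.|)^{-1}(x) on H = R.

    Fixed-point iteration y <- prox_{gamma*lam|.|}(y - gamma*(A(y) - x)),
    which is a contraction for gamma in (0, 2*alpha/M**2).
    """
    gamma = alpha / M**2
    y = 0.0
    for _ in range(iters):
        y = soft_threshold(y - gamma * (A(y) - x), gamma * lam)
    return y

y_far = generalized_prox(A, 5.0, 1.0, alpha=1.0, M=3.0)
y_near = generalized_prox(A, 0.5, 1.0, alpha=1.0, M=3.0)
print(y_far, 5.0 - A(y_far))   # residual x - A(y) should equal lam = 1 since y > 0
print(y_near)                  # |x - A(0)| <= lam, so y = 0
```

In this one-dimensional setting the inclusion x ∈ A(y) + λ∂|y| reads x − A(y) = λ sign(y) when y ≠ 0, and |x − A(0)| ≤ λ when y = 0, which is what the two printed cases illustrate.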

Let f ∈ Γ0(H). It follows from the Brøndsted–Rockafellar theorem (see, e.g., [265, Theorem 6.5]) that D(∂f) ≠ ∅, and thus f admits a supporting hyperplane, that is,

∃(z,ξ,β) ∈ H×H×R such that

∀w ∈ H, f (w) ≥ ⟨ξ, w⟩+β,

f (z) = ⟨ξ, z⟩+β.

11.2.2 Main objective and extensions of generalized differentiability notions

In what follows we denote by

• A (·,H) the set of all parameterized single-valued maps A : R+ ×H → H such that A(t , ·) ∈A (H) for all t ≥ 0;

• Γ0(·,H) the set of all parameterized extended-real-valued functions f : R+×H → R ∪ {+∞} such that f(t,·) ∈ Γ0(H) for all t ≥ 0.

Let (A, f) ∈ A (·,H)×Γ0(·,H) and x : R+ → H be a given function. Our main objective in [B26] was to derive sufficient conditions on A, f and x under which the function y : R+ → H defined by

\[ y(t) := \mathrm{prox}_{A(t,\cdot),\, f(t,\cdot)}(x(t)), \]

for all t ≥ 0, is differentiable at t = 0 and to provide an explicit formula for y′(0). In the literature, notions of semi-differentiability, twice epi-differentiability and proto-differentiability have been introduced respectively in [263], [274] and [275]. These wonderful tools turned out to be sufficient in order to deal with the case where A and f are t-independent. We refer to [276, 278] for the finite-dimensional case and to [229] for the infinite-dimensional Hilbert setting. As a first step towards the t-dependence of A and f, we needed in [B26, Section 3] to extend the three notions of semi-differentiability, twice epi-differentiability and proto-differentiability to the t-dependent setting. This section is dedicated to recalling these generalizations.

Definition 11.6 (Semi-differentiability). Let A : R+×H → H be a parameterized single-valued map and let x ∈ H. If the limit

\[ D_s A(x)(w) := \lim_{\substack{\tau \to 0 \\ w' \to w}} \frac{A(\tau, x + \tau w') - A(0, x)}{\tau} \]

exists in H for all w ∈ H, we say that A is semi-differentiable at x. In that case Ds A(x) : H → H is a single-valued map called the semi-derivative of A at x.
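To make the role of the t-dependence concrete, here is a small numerical sketch on H = R. The map A(t, x) = 2x + sin(x) + t is a hypothetical example chosen for illustration (it is not taken from [B26]); for this map a direct computation gives Ds A(x)(w) = (2 + cos x) w + 1, which is affine rather than linear in w because of the dependence on t.

```python
import math

def A(t, x):
    """Hypothetical parameterized map on R (assumption for illustration)."""
    return 2.0 * x + math.sin(x) + t

def semi_difference_quotient(A, tau, x, w_prime):
    """Quotient (A(tau, x + tau*w') - A(0, x)) / tau from Definition 11.6."""
    return (A(tau, x + tau * w_prime) - A(0.0, x)) / tau

def expected_semi_derivative(x, w):
    """Closed form for this particular A: (2 + cos x) * w + 1."""
    return (2.0 + math.cos(x)) * w + 1.0

x, w = 0.3, 1.5
for tau in (1e-2, 1e-4, 1e-6):
    # w' = w + tau is one admissible way of letting w' tend to w.
    q = semi_difference_quotient(A, tau, x, w + tau)
    print(tau, q, expected_semi_derivative(x, w))
```

The quotients approach the closed-form value as τ decreases; the constant term 1 is the contribution of the partial derivative in t.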

Definition 11.7 (Proto-differentiability). Let A : R+×H ⇒ H be a parameterized set-valued map. We introduce the first-order difference quotient set-valued maps given by

\[ \Delta_\tau A(x|v) : H \rightrightarrows H, \qquad w \mapsto \Delta_\tau A(x|v)(w) := \frac{A(\tau, x + \tau w) - v}{\tau}, \]

for all τ > 0, x ∈ H and v ∈ A(0, x). We say that A is proto-differentiable at x ∈ H for v ∈ A(0, x) if (∆τA(x|v))τ>0 graphically converges. In that case we denote by

Dp A(x|v) := G-lim∆τA(x|v)

the set-valued map Dp A(x|v) : H⇒H called the proto-derivative of A at x for v .

Definition 11.8 (Twice epi-differentiability). Let f ∈ Γ0(·,H). We introduce the second-order difference quotient functions given by

\[ \Delta^2_\tau f(x|v) : H \to \mathbb{R} \cup \{+\infty\}, \qquad w \mapsto \Delta^2_\tau f(x|v)(w) := \frac{f(\tau, x + \tau w) - f(\tau, x) - \tau \langle v, w \rangle}{\tau^2}, \tag{11.1} \]

for all τ > 0, x ∈ dom(f) := ∩t≥0 dom(f(t,·)) and v ∈ ∂f(0, x) := ∂(f(0,·))(x). We say that f is twice epi-differentiable at x ∈ dom(f) for v ∈ ∂f(0, x) if (∆²_τ f(x|v))τ>0 Mosco epi-converges. In that case we denote by

\[ d^2_e f(x|v) := \text{ME-lim}\, \Delta^2_\tau f(x|v) \]

the extended-real-valued function d²_e f(x|v) : H → R ∪ {±∞} called the second-order epi-derivative of f at x for v.

Remark 11.9. If the single-valued map A is t-independent in Definition 11.6, we recover the classical notion of semi-differentiability originally introduced in [263]. Similarly, if the set-valued map A is t-independent in Definition 11.7, we recover the classical notion of proto-differentiability originally introduced in [275]. Similarly again, if the extended-real-valued function f is t-independent in Definition 11.8, then we recover the classical notion of twice epi-differentiability originally introduced in [274] (up to the multiplicative constant 1/2, see Section 11.3.3 for a justification of this minor change).

Remark 11.10. I mention here that the above generalizations are not sufficient in order to fully adapt to the t-dependent framework the sensitivity analysis performed in [229, 276, 278]. Indeed, Remark 11.14 below shows that the situation is more intricate in the t-dependent setting and needs adjustments. I refer to the beginning of Section 11.3 for a detailed discussion on that technical point and for the introduction of a suitable and new concept called convergent supporting hyperplane. Finally let me mention that, in contrast to the t-independent framework, the above twice epi-differentiability of a parameterized function f ∈ Γ0(·,H) cannot be directly related to the twice epi-differentiability of its conjugate function f∗. I refer to [B26, Section 6.2] for more details on that point which will not be discussed in the present manuscript.

Remark 11.11. The above extensions of the two notions of semi-differentiability and proto-differentiability, from the t-independent framework to the t-dependent one, are utterly natural. In contrast, note that the extension of the notion of twice epi-differentiability, from the t-independent framework to the t-dependent one, can actually be done in several different ways. I refer to Section 11.3.3 which is devoted to a discussion justifying our choice of Formula (11.1).

Example 11.12. Let H = R and f(t, x) := |x − t| for all (t, x) ∈ R+×R. Let us consider x = 0 and v = 0 ∈ ∂f(0, x). One can easily compute that

\[ \Delta^2_\tau f(x|v)(w) = \frac{|w - 1| - 1}{\tau}, \]

for all τ > 0 and all w ∈ R. One can easily deduce that f is twice epi-differentiable at x for v with

\[ d^2_e f(x|v)(w) = \begin{cases} -\infty & \text{if } w \in [0,2], \\ +\infty & \text{if } w \notin [0,2]. \end{cases} \]
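The blow-up in Example 11.12 can be observed numerically. The sketch below simply evaluates Formula (11.1) for f(t, x) = |x − t| at x = 0 and v = 0; the function names are illustrative, and pointwise evaluation is of course not a proof of Mosco epi-convergence.

```python
def f(t, x):
    """Parameterized function of Example 11.12."""
    return abs(x - t)

def second_order_quotient(tau, x, v, w):
    """Second-order difference quotient of Formula (11.1) on H = R."""
    return (f(tau, x + tau * w) - f(tau, x) - tau * v * w) / (tau * tau)

for tau in (1e-1, 1e-2, 1e-3):
    inside = second_order_quotient(tau, 0.0, 0.0, 1.0)    # w in (0, 2): equals -1/tau
    outside = second_order_quotient(tau, 0.0, 0.0, 3.0)   # w outside [0, 2]: equals 1/tau
    print(tau, inside, outside)
```

As τ decreases, the quotient tends to −∞ for w in the open interval (0, 2) and to +∞ for w outside [0, 2], in agreement with the closed form (|w − 1| − 1)/τ.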

Example 11.13. Let H = R and f(t, x) := |x − t²| for all (t, x) ∈ R+×R. Let us consider x = 0 and v = 0 ∈ ∂f(0, x). One can easily compute that

\[ \Delta^2_\tau f(x|v)(w) = \frac{|w - \tau| - \tau}{\tau}, \]

for all τ > 0 and all w ∈ R. One can easily deduce that f is twice epi-differentiable at x for v with

\[ d^2_e f(x|v)(w) = \begin{cases} -1 & \text{if } w = 0, \\ +\infty & \text{if } w \neq 0. \end{cases} \]
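The value −1 at w = 0 in Example 11.13 comes from the epi-limit and not from the pointwise limit. The sketch below, with illustrative function names, evaluates Formula (11.1) for f(t, x) = |x − t²| at w = 0, along the moving points w_τ = τ → 0, and at a fixed w ≠ 0.

```python
def f(t, x):
    """Parameterized function of Example 11.13."""
    return abs(x - t * t)

def second_order_quotient(tau, x, v, w):
    """Second-order difference quotient of Formula (11.1) on H = R."""
    return (f(tau, x + tau * w) - f(tau, x) - tau * v * w) / (tau * tau)

for tau in (1e-1, 1e-2, 1e-3):
    at_zero = second_order_quotient(tau, 0.0, 0.0, 0.0)     # pointwise value at w = 0
    along_w_tau = second_order_quotient(tau, 0.0, 0.0, tau) # along w_tau = tau -> 0
    away = second_order_quotient(tau, 0.0, 0.0, 0.5)        # fixed w != 0
    print(tau, at_zero, along_w_tau, away)
```

The pointwise value at w = 0 is 0 for every τ, while the values along w_τ = τ equal −1. The epi-limit at w = 0 takes the infimum over such approximating sequences, which explains the value −1 and the loss of positive homogeneity.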

Remark 11.14. Let f ∈ Γ0(·,H) be twice epi-differentiable at x ∈ dom(f) for v ∈ ∂f(0, x). Even if ∆²_τ f(x|v) is with values in R ∪ {+∞} for all τ > 0, it may be possible that there exists w ∈ H such that d²_e f(x|v)(w) = −∞. In particular it might be possible that d²_e f(x|v) ∉ Γ0(H) (see Example 11.12). It means that Proposition 10.7 has no counterpart in the t-dependent setting. This is an important difference which requires careful attention.

Remark 11.15. Let f ∈ Γ0(·,H) be twice epi-differentiable at x ∈ dom(f) for v ∈ ∂f(0, x). It might be possible that d²_e f(x|v) is not positively homogeneous of degree two (see Example 11.13). This is a second difference with respect to the t-independent framework.

11.3 Main contributions

The next lemma, whose proof is a simple adaptation to the t-dependent framework of the t-independent case (that can be found in [277, Proposition 2.7]), relates the subdifferential operator of a second-order difference quotient function associated to a parameterized function f ∈ Γ0(·,H) with the first-order difference quotient set-valued maps associated to its parameterized subdifferential operator ∂f : R+×H ⇒ H.

Lemma 11.16. Let f ∈ Γ0(·,H), x ∈ dom(f) and v ∈ ∂f(0, x). Then ∆²_τ f(x|v) belongs to Γ0(H) and

\[ \partial\big(\Delta^2_\tau f(x|v)\big) = \Delta_\tau(\partial f)(x|v), \]

for all τ> 0.

From Lemma 11.16 and Attouch's theorems (Theorems 11.4 and 11.5), one can expect that the twice epi-differentiability of a parameterized function f ∈ Γ0(·,H) is strongly related to the proto-differentiability of its parameterized subdifferential operator ∂f : R+×H ⇒ H. In the t-independent framework, the next proposition has been established in [229, 276, 278].

Proposition 11.17. Let f ∈ Γ0(H) (that is t-independent), x ∈ dom(f) and v ∈ ∂f(x). The following assertions are equivalent:

(i) f is twice epi-differentiable at x for v;

(ii) ∂ f is proto-differentiable at x for v and Dp (∂ f )(x|v) is a maximal monotone operator.

In that case d²_e f(x|v) belongs to Γ0(H) with d²_e f(x|v)(0) = 0 and

\[ \partial\big(d^2_e f(x|v)\big) = D_p(\partial f)(x|v). \]

However, we deduce from Remark 11.14 that Proposition 11.17 does not admit an exact counterpart in the t-dependent framework. Hence, our aim in [B26, Section 4.1] was to introduce a new concept, called convergent supporting hyperplane, that allows us to extend Proposition 11.17 to the t-dependent framework. This condition actually turns out to be necessary and sufficient in a sense that can be made precise. I refer to Section 11.3.1 for details. Section 11.3.2 is then devoted to recalling our major contribution in [B26] (see Theorem 11.24) concerning the initial motivation presented in Section 11.2.2.

11.3.1 Convergent supporting hyperplane

The notion of supporting hyperplane has been recalled in Section 11.2.1. The new notion below was originally introduced in [B26, Definition 4.5].

Definition 11.18 (Convergent supporting hyperplane). Let (fτ)τ>0 be a parameterized family of functions in Γ0(H). We say that (fτ)τ>0 admits a convergent supporting hyperplane if there exists (zτ,ξτ,βτ)τ>0 → (z,ξ,β) such that (zτ,ξτ,βτ) is a supporting hyperplane of fτ for all τ > 0.

We are now in a position to state a counterpart of Proposition 11.17 in the t-dependent framework, under the assumption of the existence of a convergent supporting hyperplane. The following theorem has been proved in [B26, Theorem 4.7].

Theorem 11.19. Let f ∈ Γ0(·,H), x ∈ dom(f) and v ∈ ∂f(0, x). Assume that (∆²_τ f(x|v))τ>0 admits a convergent supporting hyperplane (zτ,ξτ,βτ)τ>0 → (z,ξ,β). Then the following assertions are equivalent:

(i) f is twice epi-differentiable at x for v;

(ii) ∂ f is proto-differentiable at x for v and Dp (∂ f )(x|v) is a maximal monotone operator.

In that case d²_e f(x|v) belongs to Γ0(H) with β ≤ d²_e f(x|v)(0) ≤ 0 and

\[ \partial\big(d^2_e f(x|v)\big) = D_p(\partial f)(x|v). \]

Remark 11.20. In the t-independent framework, Theorem 11.19 exactly coincides with Proposition 11.17. Indeed, in that context, one can easily see that (∆²_τ f(x|v))τ>0 admits a convergent supporting hyperplane given by (zτ,ξτ,βτ) = (0,0,0) for all τ > 0.

Remark 11.21. In Example 11.12, note that (∆²_τ f(x|v))τ>0 does not admit a convergent supporting hyperplane.

Remark 11.22. In contrast to Proposition 11.17, it might be possible that d²_e f(x|v)(0) ≠ 0 in Theorem 11.19 (see Example 11.13 for instance).

Remark 11.23. In [B26, Proposition 4.12], we proved that the existence of a convergent supporting hyperplane is also a necessary condition for the assertions of Theorem 11.19. Precisely we proved that, for f ∈ Γ0(·,H) twice epi-differentiable at x ∈ dom(f) for v ∈ ∂f(0, x), we have d²_e f(x|v) ∈ Γ0(H) if and only if (∆²_τ f(x|v))τ>0 admits a convergent supporting hyperplane.

11.3.2 Main result

Before recalling the statement of our main contribution [B26, Theorem 4.15], I now introduce a last notation Aunif(·,H), which stands for the set of all parameterized single-valued maps A : R+×H → H such that A is uniformly Lipschitz continuous, that is,

∃M ≥ 0, ∀t ≥ 0, ∀x1, x2 ∈ H, ‖A(t , x2)− A(t , x1)‖ ≤ M‖x2 −x1‖,

and uniformly strongly monotone, that is,

∃α > 0, ∀t ≥ 0, ∀x1, x2 ∈ H, ⟨A(t, x2) − A(t, x1), x2 − x1⟩ ≥ α‖x2 − x1‖².

In particular we proved in [B26, Proposition 3.3] that if A ∈ Aunif(·,H) is semi-differentiable at x ∈ H, then Ds A(x) ∈ A (H). The next theorem is the major result of the paper [B26].

Theorem 11.24. Let (A, f) ∈ A (·,H)×Γ0(·,H) and let x : R+ → H be a function. We consider the function y : R+ → H defined by

\[ y(t) := \mathrm{prox}_{A(t,\cdot),\, f(t,\cdot)}(x(t)), \]

for all t ≥ 0. If the following assertions are satisfied:

(i) x is differentiable at t = 0;

(ii) A ∈Aunif(·,H);

(iii) A is semi-differentiable at y(0);

(iv) f is twice epi-differentiable at y(0) for v0 := x(0)− A(0, y(0)) ∈ ∂ f (0, y(0));

(v) d²_e f(y(0)|v0) ∈ Γ0(H);

then y : R+ → H is differentiable at t = 0 with

\[ y'(0) = \mathrm{prox}_{D_s A(y(0)),\, d^2_e f(y(0)|v_0)}(x'(0)). \]

Remark 11.25. Assume that all assumptions of Theorem 11.24 are satisfied. Then, by using the notations introduced in the Introduction, its conclusion can be rewritten as follows. If y(t) is the unique solution to the variational inequality (VI(A(t,·), f(t,·), x(t))) for all t ≥ 0, then y : R+ → H is differentiable at t = 0 and y′(0) is the unique solution to (VI(Ds A(y(0)), d²_e f(y(0)|v0), x′(0))).
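The formula of Theorem 11.24 can be checked numerically in a very simple case. The sketch below is an illustration under stated assumptions: H = R, A is the identity (so that Ds A(y(0)) is the identity and the generalized proximal operator reduces to the classical one), f(t, x) = |x − t²| as in Example 11.13, and x(t) = c t with an arbitrary slope c. Then y(0) = 0 and v0 = 0, and since d²_e f(0|0) equals −1 at w = 0 and +∞ elsewhere, its proximal operator maps every point to 0, so the predicted derivative is y′(0) = 0.

```python
import math

def soft_threshold(u, kappa):
    """Classical proximal operator of kappa*|.| on the real line."""
    return math.copysign(max(abs(u) - kappa, 0.0), u)

def y(t, c=0.5):
    """y(t) = prox_{f(t,.)}(x(t)) with f(t, x) = |x - t^2| and x(t) = c*t.

    Uses prox of |. - s| at z, namely s + soft_threshold(z - s, 1).
    """
    shift = t * t
    return shift + soft_threshold(c * t - shift, 1.0)

def predicted_derivative():
    """prox of d2e f(0|0): the only point where it is finite is w = 0."""
    return 0.0

for h in (1e-2, 1e-3, 1e-4):
    finite_difference = (y(h) - y(0.0)) / h
    print(h, finite_difference, predicted_derivative())
```

For small t one has y(t) = t², so the finite differences equal h and tend to the predicted value 0, whatever the slope c of x.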

11.3.3 Comments on the choice of Formula (11.1)

In the whole section, let us consider f ∈ Γ0(·,H), x ∈ dom(f) and v ∈ ∂f(0, x). Note that our main results in [B26] (summarized in the present chapter) are essentially based on Attouch's theorems (Theorems 11.4 and 11.5) and on Lemma 11.16. One can easily see that these three results are totally independent of the definition of ∆²_τ f(x|v), provided that ∆²_τ f(x|v) ∈ Γ0(H) and ∂(∆²_τ f(x|v)) = ∆τ(∂f)(x|v). As a consequence it is clear that one could expect to adapt the whole paper [B26] with a different definition of ∆²_τ f(x|v). For example, instead of Formula (11.1), one could consider

\[ \Delta^2_\tau f(x|v)(w) := \frac{f(\tau, x + \tau w) - f(0, x) - \tau \langle v, w \rangle}{\tau^2}, \tag{11.2} \]

or

\[ \Delta^2_\tau f(x|v)(w) := \frac{f(\tau, x + \tau w) - f(\tau, x) - \tau \langle v(\tau), w \rangle}{\tau^2}, \tag{11.3} \]

where v(t) ∈ ∂f(t, x) := ∂(f(t,·))(x) for all t ≥ 0. This is the reason why it is important to justify the choice of Formula (11.1). Actually this choice was natural and has prevailed with respect to the initial motivation of [B26] by investigating the case where f is smooth. Indeed assume that f is smooth on R+×H and let y, z : R+ → H be two given functions differentiable at t = 0 which are correlated by the expression y(t) = prox_{f(t,·)}(z(t)) for all t ≥ 0. We deduce that y(t) + ∇x f(t, y(t)) = z(t) for all t ≥ 0 and thus

\[ y'(0) + \nabla^2_{tx} f(0, y(0)) + \nabla^2_{xx} f(0, y(0))(y'(0)) = z'(0). \]

Finally we obtain that y′(0) = prox_{ϕ_{y(0)}}(z′(0)) where ϕ_x : H → R is defined by

\[ \varphi_x(w) := \frac{1}{2} \big\langle \nabla^2_{xx} f(0, x)(w), w \big\rangle + \big\langle \nabla^2_{tx} f(0, x), w \big\rangle, \]

for all x, w ∈ H. Hence it was natural to choose a general expression of ∆²_τ f(x|v) that converges pointwise on H to ϕ_x (up to an additive constant) whenever f is smooth. Note that Formula (11.1) does (see [B26, Remark 3.12]). In contrast, if f is smooth and if ∆²_τ f(x|v)(w) is defined as in (11.2), then ∆²_τ f(x|v)(w) does not converge if ∇t f(0, x) ≠ 0. Similarly, if f is smooth and ∆²_τ f(x|v)(w) is defined as in (11.3), then ∆²_τ f(x|v)(w) converges to (1/2)⟨∇²_xx f(0, x)(w), w⟩, which is different (even up to an additive constant) from ϕ_x(w) if ∇²_tx f(0, x) ≠ 0. Finally note that the above comments also justify the minor change in Formula (11.1) with respect to Rockafellar's work [274], precisely, the non-use of the multiplicative constant 1/2 (see Remark 10.4, or Remark 11.9 in the present chapter).
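The comparison between Formulas (11.1), (11.2) and (11.3) in the smooth case can be checked on a toy example. The sketch below assumes H = R and the hypothetical smooth function f(t, x) = x²/2 + t x + t, chosen for illustration only; for it ∇x f(t, x) = x + t, ∇²_xx f = 1, ∇²_tx f = 1 and ∇t f(0, x) = x + 1, so that ϕ_x(w) = w²/2 + w.

```python
def f(t, x):
    """Hypothetical smooth parameterized function (assumption)."""
    return 0.5 * x * x + t * x + t

def grad_x_f(t, x):
    return x + t

def phi(w):
    """phi_x(w) = 0.5*<hess_xx f w, w> + <hess_tx f, w> for this f."""
    return 0.5 * w * w + w

def quotient_11_1(tau, x, w):
    v = grad_x_f(0.0, x)
    return (f(tau, x + tau * w) - f(tau, x) - tau * v * w) / (tau * tau)

def quotient_11_2(tau, x, w):
    v = grad_x_f(0.0, x)
    return (f(tau, x + tau * w) - f(0.0, x) - tau * v * w) / (tau * tau)

def quotient_11_3(tau, x, w):
    v_tau = grad_x_f(tau, x)
    return (f(tau, x + tau * w) - f(tau, x) - tau * v_tau * w) / (tau * tau)

x, w = 0.3, 2.0
for tau in (1e-1, 1e-2, 1e-3):
    print(tau, quotient_11_1(tau, x, w), quotient_11_2(tau, x, w),
          quotient_11_3(tau, x, w), phi(w))
```

For this quadratic f, the quotient (11.1) equals ϕ_x(w) for every τ, the quotient (11.2) equals ϕ_x(w) + (x + 1)/τ and therefore diverges, and the quotient (11.3) equals w²/2, which misses the mixed term ⟨∇²_tx f(0, x), w⟩.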

11.4 Applications and perspectives

This section is dedicated to the applications of Theorem 11.24 to parameterized convex minimization problems extracted from [B26, Section 5], but also to the sensitivity analysis of a parameterized scalar Tresca friction problem which involves a perturbed friction threshold. This last result is actually a recent ongoing work in collaboration with Caubet.

11.4.1 Applications to parameterized convex minimization problems

In [B26, Section 5] we derived from Theorem 11.24 that the derivative of the solution to a parameterized convex minimization problem is still, under some appropriate assumptions, the solution to a convex minimization problem. The next proposition, extracted from [B26, Proposition 5.1], is a result in this direction.

Proposition 11.26. Let f ∈ Γ0(·,H) and ℓ : R+ → H be given functions. Let g : R+×H → R be such that g(t,·) is differentiable on H with ∇x g(t,·) ∈ A (H) for all t ≥ 0. Then, for all t ≥ 0, the parameterized convex minimization problem

\[ \operatorname*{argmin}_{x \in H} \big[ f(t, x) + g(t, x) - \langle \ell(t), x \rangle \big], \]

admits a unique solution denoted by y(t ). If moreover the following assumptions are satisfied:

(i) ℓ is differentiable at t = 0;

(ii) ∇x g ∈ Aunif(·,H);

(iii) ∇x g is of class C¹ on R+×H;

(iv) f is twice epi-differentiable at y(0) for v0 := ℓ(0) − ∇x g(0, y(0)) ∈ ∂f(0, y(0));

(v) d²_e f(y(0)|v0) ∈ Γ0(H);

then y : R+ → H is differentiable at t = 0 and y′(0) is the unique solution to the convex minimization problem given by

\[ \operatorname*{argmin}_{x \in H} \Big[ d^2_e f(y(0)|v_0)(x) + \frac{1}{2} \langle \nabla^2_{xx} g(0, y(0))(x), x \rangle + \langle \nabla^2_{tx} g(0, y(0)) - \ell'(0), x \rangle \Big]. \]

Note that Proposition 11.26 is illustrated with a one-dimensional example in [B26, Section 5.2]. Finally, in [B26, Section 5], we investigated the particular case of parameterized smooth convex minimization problems with inequality constraints and we obtained in [B26, Proposition 5.4] the next proposition whose proof essentially corresponds to the adaptation of the proof of [278, Theorem 13.14] to the t-dependent framework.

Proposition 11.27. Let m, d ∈ N∗. Let F := (Fi)i=1,...,d : R+×R^m → R^d be such that Fi ∈ Γ0(·,R^m) for every i = 1, ..., d. We assume that K(t) := {x ∈ R^m | F(t, x) ∈ R^d_−} is not empty for all t ≥ 0. Let g : R+×R^m → R be such that g(t,·) is differentiable on R^m with ∇x g(t,·) ∈ A (R^m) for all t ≥ 0. Then, for all t ≥ 0, the parameterized convex minimization problem with inequality constraints

\[ \operatorname*{argmin}_{\substack{x \in \mathbb{R}^m \\ F(t,x) \in \mathbb{R}^d_-}} g(t, x) \]

admits a unique solution denoted by y(t ). If moreover the following assumptions are satisfied:

(i) ∇x g ∈ Aunif(·,R^m);

(ii) ∇x g is of class C¹ on R+×R^m;

(iii) F is of class C² on R+×R^m;

(iv) ∇t F(0, y(0)) = 0 in R^d;

(v) y(0) ∈ K(t) for all t ≥ 0;

(vi) ‖∇x F(t, y(0))ᵀw‖ ≥ α in R^m for all w ∈ N_{R^d_−}(F(t, y(0))) with ‖w‖ = 1 in R^d and all t ≥ 0, for some α > 0;

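then y : R+ → R^m is differentiable at t = 0 and y′(0) is the unique solution to the convex minimization problem with linear inequality/equality constraints given by

\[ \operatorname*{argmin}_{\substack{x \in \mathbb{R}^m \\ x \in K(y(0)|v_0)}} \Big[ \frac{1}{2} \langle \nabla^2_{xx} g(0, y(0))(x), x \rangle_{\mathbb{R}^m} + \langle \nabla^2_{tx} g(0, y(0)), x \rangle_{\mathbb{R}^m} + \frac{1}{2} \max_{w \in Y(y(0)|v_0)} \langle w, D^2 F(0, y(0))(1, x) \rangle_{\mathbb{R}^d} \Big], \]

where v0 := −∇x g(0, y(0)), where

\[ K(y(0)|v_0) := \big\{ x \in \mathbb{R}^m \mid \nabla_x F(0, y(0))\, x \in T_{\mathbb{R}^d_-}(F(0, y(0))) \text{ and } \langle x, v_0 \rangle_{\mathbb{R}^m} = 0 \big\} \]

and

\[ Y(y(0)|v_0) := \big\{ w \in \mathbb{R}^d \mid w \in N_{\mathbb{R}^d_-}(F(0, y(0))) \text{ and } \nabla_x F(0, y(0))^\top w = v_0 \big\}, \]

where N_{R^d_−}(F(t, y(0))) and T_{R^d_−}(F(t, y(0))) stand for the standard normal and tangent cones to R^d_− at F(t, y(0)) ∈ R^d_−.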

Remark 11.28. Note that the above Propositions 11.26 and 11.27 provide an interesting perspective concerning the initialization of optimization algorithms. For example one could start an optimization algorithm for solving the convex minimization problem of Proposition 11.27 at t = 1 with the approximation y(0) + y′(0) of y(1).

11.4.2 A work in progress

In this last section let us consider the notations and assumptions introduced in Chapters 8 and 10. In a very recent work (still in progress) with Caubet, the contributions of the paper [B26] (summarized in the present chapter) turn out to be suitable for investigating the sensitivity analysis of the parameterized scalar Tresca friction problem given by: find u_t ∈ H¹(Ω) such that

\[ \begin{cases} -\Delta u_t + u_t = f_t & \text{in } \Omega, \\ |\partial_n u_t| \le g_t \ \text{and} \ |u_t|\, \partial_n u_t + u_t\, g_t = 0 & \text{on } \Gamma, \end{cases} \tag{11.4} \]

where f_t is a parameterized source term and g_t is a parameterized positive friction threshold. In that context, the unique solution to the variational formulation of Problem (11.4) is given by u_t = prox_{Φ_t}(F_t) where F_t ∈ H¹(Ω) is the unique solution to Problem (8.7) and where the parameterized Tresca friction functional Φ_t ∈ Γ0(H¹(Ω)) is defined by

\[ \Phi_t : H^1(\Omega) \to \mathbb{R}, \qquad \varphi \mapsto \int_\Gamma g_t\, |\varphi|. \]

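In that context, following the spirit of the paper [B29] (presented in Chapter 10) with the tools presented in the present chapter (in particular with the application of Theorem 11.24), we are able to prove that the map t ≥ 0 ↦ u_t ∈ H¹(Ω) is differentiable at t = 0 and its derivative denoted by u′_0 ∈ H¹(Ω) is the unique weak solution to the Signorini problem given by

\[ \begin{cases} -\Delta u'_0 + u'_0 = f'_0 & \text{in } \Omega, \\ \partial_n u'_0 - g'_0 \dfrac{\partial_n u_0}{g_0} = 0 & \text{on } \Gamma^{u_0}_N, \\ u'_0 = 0 & \text{on } \Gamma^{u_0}_D, \\ u'_0 \le 0, \ \partial_n u'_0 - g'_0 \dfrac{\partial_n u_0}{g_0} \le 0 \ \text{and} \ u'_0 \Big( \partial_n u'_0 - g'_0 \dfrac{\partial_n u_0}{g_0} \Big) = 0 & \text{on } \Gamma^{u_0}_{S-}, \\ u'_0 \ge 0, \ \partial_n u'_0 - g'_0 \dfrac{\partial_n u_0}{g_0} \ge 0 \ \text{and} \ u'_0 \Big( \partial_n u'_0 - g'_0 \dfrac{\partial_n u_0}{g_0} \Big) = 0 & \text{on } \Gamma^{u_0}_{S+}, \end{cases} \]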

where

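\[ \Gamma^{u_0}_N := \{ s \in \Gamma \mid u_0(s) \neq 0 \}, \qquad \Gamma^{u_0}_D := \{ s \in \Gamma \mid u_0(s) = 0 \text{ and } \partial_n u_0(s) \in (-g_0(s), g_0(s)) \}, \]

\[ \Gamma^{u_0}_{S-} := \{ s \in \Gamma \mid u_0(s) = 0 \text{ and } \partial_n u_0(s) = g_0(s) \}, \qquad \Gamma^{u_0}_{S+} := \{ s \in \Gamma \mid u_0(s) = 0 \text{ and } \partial_n u_0(s) = -g_0(s) \}. \]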

In the case where g_t ≡ 1, since g′_0 = 0, we recover the main result of [B29] recalled in Theorem 10.21.

The above ongoing work, which extends the previous paper [B29] to the case of a perturbed friction threshold by using the results obtained in the paper [B26], will be the subject of a forthcoming article. It constitutes the next step towards the shape sensitivity analysis of the scalar Tresca friction model, which is the initial motivation presented in Chapter 8 and which is at the origin of all my collaborations [B26, B27, B28, B29] with Adly and Caubet.

[B25] P. Bonnelie, L. Bourdin, F. Caubet and O. Ruatta. Flip procedure in geometric approximation of multiple-component shapes – Application to multiple-inclusion detection. SMAI J. Comput. Math., 2:255–276, 2016. 97, 98, 99, 100, 101, 102, 104, 106, 109

[B26] S. Adly and L. Bourdin. Sensitivity analysis of variational inequalities via twice epi-differentiability and proto-differentiability of the proximity operator. SIAM J. Optim., 28(2):1699–1725, 2018. vi, 109, 112, 121, 133, 135, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146

[B27] S. Adly, L. Bourdin, and F. Caubet. On a decomposition formula for the proximal operator of the sum of two convex functions. J. Convex Anal., 26(3):699–718, 2019. vi, 109, 111, 113, 114, 115, 121, 146

[B28] S. Adly and L. Bourdin. On a decomposition formula for the resolvent operator of the sum of two set-valued maps with monotonicity assumptions. Appl. Math. Optim., 80(3):715–732, 2019. vi, 109, 111, 113, 114, 115, 116, 117, 118, 119, 120, 121, 146

[B29] S. Adly, L. Bourdin, and F. Caubet. The derivative of a parameterized mechanical contact problem with a Tresca’s friction law involves Signorini unilateral conditions. Submitted, 2020. 109, 112, 121, 123, 124, 125, 126, 127, 128, 129, 130, 131, 133, 145, 146



[204] S. Adly and R. T. Rockafellar. Sensitivity analysis of monotone inclusions via the proto-differentiability of the resolvent operator. Submitted, 2020. 137

[205] L. Afraites, M. Dambrine, K. Eppler, and D. Kateb. Detecting perfectly insulated obstacles by shape optimization techniques of order two. Discrete Contin. Dyn. Syst. Ser. B, 8(2):389–416, 2007. 99, 101

[206] J. M. Aitchison and M. W. Poole. A numerical algorithm for the solution of Signorini problems. J. Comput. Appl. Math., 94(1):55–67, 1998. 131

[207] H. Ammari and H. Kang. Reconstruction of small inhomogeneities from boundary measurements, volume 1846 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 2004. 99

[208] J. Andersson. Optimal regularity for the Signorini problem and its free boundary. Invent. Math., 204(1):1–82, 2016. 124

[209] S. S. Antman. The influence of elasticity on analysis: modern developments. Bull. Amer. Math. Soc. (N.S.), 9(3):267–291, 1983. 124

[210] F. J. Aragón Artacho and R. Campoy. Computing the resolvent of the sum of maximally monotone operators with the averaged alternating modified reflections algorithm. J. Optim. Theory Appl., 181(3):709–726, 2019. 119

[211] H. Attouch. Variational convergence for functions and operators. Applicable Mathematics Series. Pitman (Advanced Publishing Program), Boston, MA, 1984. 126, 136, 138

[212] J.-P. Aubin and H. Frankowska. Set-valued analysis. Modern Birkhäuser Classics. Birkhäuser Boston, Inc., Boston, MA, 2009. Reprint of the 1990 edition [MR1048347]. 115

[213] H. H. Bauschke and P. L. Combettes. Convex analysis and monotone operator theory in Hilbert spaces. CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC. Springer, Cham, second edition, 2017. With a foreword by Hédy Attouch. 110, 118, 121

[214] E. Bishop. A generalization of the Stone-Weierstrass theorem. Pacific J. Math., 11:777–783, 1961. 100

[215] P. Bonnelie. Déformations libres de contours pour l’optimisation de formes et application en électromagnétisme. PhD thesis, 2017. 97

[216] J. M. Borwein. Fifty years of maximal monotonicity. Optim. Lett., 4(4):473–490, 2010. 115

[217] H. Brézis. Équations et inéquations non linéaires dans les espaces vectoriels en dualité. Ann. Inst. Fourier (Grenoble), 18(1):115–175, 1968. 136, 139

[218] H. Brézis. Problèmes unilatéraux. J. Math. Pures Appl. (9), 51:1–168, 1972. 136


[219] H. Brézis. Opérateurs maximaux monotones et semi-groupes de contractions dans les espaces de Hilbert. North-Holland Publishing Co., Amsterdam-London; American Elsevier Publishing Co., Inc., New York, 1973. North-Holland Mathematics Studies, No. 5. Notas de Matemática (50). 115

[220] H. Brézis. Functional analysis, Sobolev spaces and partial differential equations. Universitext. Springer, New York, 2011. 119, 120, 128

[221] M. Burger and S. J. Osher. A survey on level set methods for inverse problems and optimal design. European J. Appl. Math., 16(2):263–301, 2005. 99

[222] F. Caubet and M. Dambrine. Localization of small obstacles in Stokes flow. Inverse Problems, 28(10):105007, 31, 2012. 99

[223] F. Caubet, M. Dambrine, and D. Kateb. Shape optimization methods for the inverse obstacle problem with generalized impedance boundary conditions. Inverse Problems, 29(11):115011, 26, 2013. 106

[224] F. Caubet, M. Dambrine, D. Kateb, and C. Z. Timimoun. A Kohn-Vogelius formulation to detect an obstacle immersed in a fluid. Inverse Probl. Imaging, 7(1):123–157, 2013. 101

[225] F. Chouly. An adaptation of Nitsche’s method to the Tresca friction problem. J. Math. Anal. Appl., 411(1):329–339, 2014. 110

[226] P. L. Combettes, D. Dung, and B. C. Vu. Dualization of signal recovery problems. Set-Valued Var. Anal., 18(3-4):373–404, 2010. 118

[227] P. L. Combettes and J.-C. Pesquet. Proximal splitting methods in signal processing. In Fixed-point algorithms for inverse problems in science and engineering, volume 49 of Springer Optim. Appl., pages 185–212. Springer, New York, 2011. 121

[228] B. Desmorat. Structural rigidity optimization with frictionless unilateral contact. Internat. J. Solids Structures, 44(3-4):1132–1144, 2007. 109

[229] C. N. Do. Generalized second-order derivatives of convex functions in reflexive Banach spaces. Trans. Amer. Math. Soc., 334(1):281–301, 1992. 127, 136, 137, 139, 140, 141

[230] J. Douglas and H. H. Rachford. On the numerical solution of heat conduction problems in two and three space variables. Trans. Amer. Math. Soc., 82:421–439, 1956. 111, 114

[231] G. Duvaut and J.-L. Lions. Inequalities in mechanics and physics. Springer-Verlag, Berlin-New York, 1976. Translated from the French by C. W. John, Grundlehren der Mathematischen Wissenschaften, 219. 109, 124

[232] C. Ericson. Real-Time Collision Detection. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2004. The Morgan Kaufmann Series in Interactive 3-D Technology. 102

[233] L. C. Evans. Partial differential equations, volume 19 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, second edition, 2010. 119, 128

[234] G. Farin. Curves and Surfaces for CAGD: A Practical Guide. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2002. 5th edition. 99

[235] G. Fichera. Problemi elastostatici con vincoli unilaterali: Il problema di Signorini con ambigue condizioni al contorno. Atti Accad. Naz. Lincei Mem. Cl. Sci. Fis. Mat. Natur. Sez. I (8), 7:91–140, 1963/1964. 124, 136

[236] S. Fitzpatrick and R. R. Phelps. Differentiability of the metric projection in Hilbert space. Trans. Amer. Math. Soc., 270(2):483–501, 1982. 136


[237] A. Haraux. How to differentiate the projection on a convex set in Hilbert space. Some applications to variational inequalities. J. Math. Soc. Japan, 29(4):615–631, 1977. 121, 133, 136

[238] J. Haslinger and R. A. E. Mäkinen. Introduction to shape optimization, volume 7 of Advances in Design and Control. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2003. Theory, approximation, and computation. 109

[239] F. Hecht. Finite element library FREEFEM++. http://www.freefem.org/ff++/. 105, 114, 120, 131

[240] A. Henrot and M. Pierre. Variation et optimisation de formes, volume 48 of Mathématiques & Applications (Berlin) [Mathematics & Applications]. Springer, Berlin, 2005. Une analyse géométrique. [A geometric analysis]. 98, 102, 104, 105

[241] J.-B. Hiriart-Urruty. Unsolved problems: at what points is the projection mapping differentiable? Amer. Math. Monthly, 89(7):456–458, 1982. 136

[242] J.-B. Hiriart-Urruty and C. Lemaréchal. Fundamentals of convex analysis. Grundlehren Text Editions. Springer-Verlag, Berlin, 2001. Abridged version of Convex analysis and minimization algorithms. I [Springer, Berlin, 1993; MR1261420 (95m:90001)] and II [ibid.; MR1295240 (95m:90002)]. 110, 138

[243] V. Isakov. Inverse problems for partial differential equations, volume 127 of Applied Mathematical Sciences. Springer, New York, second edition, 2006. 103

[244] T. Iwai, A. Sugimoto, T. Aoyama, and H. Azegami. Shape optimization problem of elastic bodies for controlling contact pressure. JSIAM Lett., 2:1–4, 2010. 109

[245] S. Kichenassamy, A. Kumar, P. Olver, A. Tannenbaum, and A. Yezzi. Gradient flows and geometric active contour models. Proceedings of the Fifth International Conference on Computer Vision (ICCV), 1995. 98

[246] N. Kikuchi and J. T. Oden. Contact problems in elasticity: a study of variational inequalities and finite element methods, volume 8 of SIAM Studies in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1988. 124

[247] N. Kim, K. Choi, J. Chen, and Y. Park. Meshless shape design sensitivity analysis and optimization for contact problem with friction. Computational Mechanics, 25(2-3):157–168, 2000. 109

[248] D. Kinderlehrer. Remarks about Signorini’s problem in linear elasticity. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4), 8(4):605–645, 1981. 124

[249] J.-L. Lions. Quelques méthodes de résolution des problèmes aux limites non linéaires. Dunod; Gauthier-Villars, Paris, 1969. 136

[250] J.-L. Lions and G. Stampacchia. Variational inequalities. Comm. Pure Appl. Math., 20:493–519, 1967. 136

[251] P.-L. Lions and B. Mercier. Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal., 16(6):964–979, 1979. 111, 114, 117, 119

[252] A. Marco and J.-J. Martínez. A fast and accurate algorithm for solving Bernstein-Vandermonde linear systems. Linear Algebra Appl., 422(2-3):616–628, 2007. 100

[253] B. Martinet. Détermination approchée d’un point fixe d’une application pseudo-contractante. Cas de l’application prox. C. R. Acad. Sci. Paris Sér. A-B, 274:A163–A165, 1972. 110, 114



[254] A. Maury, G. Allaire, and F. Jouve. Shape optimisation with the level set method for contactproblems in linearised elasticity. SMAI J. Comput. Math., 3:249–292, 2017. 109

[255] F. Mignot. Contrôle dans les inéquations variationelles elliptiques. J. Functional Analysis,22(2):130–185, 1976. 121, 136

[256] G. J. Minty. Monotone (nonlinear) operators in Hilbert space. Duke Math. J., 29:341–346,1962. 116

[257] G. J. Minty. On some aspects of the theory of monotone operators. In Theory and Applica-tions of Monotone Operators (Proc. NATO Advanced Study Inst., Venice, 1968), pages 67–82.Edizioni “Oderisi”, Gubbio, 1969. 115

[258] J.-J. Moreau. Proximité et dualité dans un espace hilbertien. Bull. Soc. Math. France, 93:273–299, 1965. 110, 114, 116

[259] U. Mosco. Convergence of convex sets and of solutions of variational inequalities. Advancesin Math., 3:510–585, 1969. 136

[260] U. Mosco. Implicit variational problems and quasi variational inequalities. In Nonlinear op-erators and the calculus of variations (Summer School, Univ. Libre Bruxelles, Brussels, 1975),pages 83–156. Lecture Notes in Math., Vol. 543. 1976. 136

[261] D. Noll. Second order differentiability of integral functionals on Sobolev spaces and L²-spaces. J. Reine Angew. Math., 436:1–17, 1993. 130

[262] J. O’Rourke. Computational geometry in C. Cambridge University Press, Cambridge, second edition, 1998. 102

[263] J.-P. Penot. Differentiability of relations and differential stability of perturbed optimization problems. SIAM J. Control Optim., 22(4):529–551, 1984. 137, 139, 140

[264] J.-P. Penot. Calculus without derivatives, volume 266 of Graduate Texts in Mathematics.Springer, New York, 2013. 138

[265] J.-P. Penot. Analysis. From concepts to applications. Universitext. Springer, 2016. 139

[266] T. D. M. Phan. 3D Free Form Method and Applications to Robotics. PhD thesis, University of Limoges (France), 2014. 106

[267] R. R. Phelps. Convex functions, monotone operators and differentiability, volume 1364 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, second edition, 1993. 115

[268] R. Poliquin and R. T. Rockafellar. Second-order nonsmooth analysis in nonlinear programming. In Recent advances in nonsmooth optimization, pages 322–349. World Sci. Publ., River Edge, NJ, 1995. 137

[269] R. A. Poliquin and R. T. Rockafellar. A calculus of epi-derivatives applicable to optimization. Canad. J. Math., 45(4):879–896, 1993. 137

[270] I. Páczelt and T. Szabó. Optimal shape design for contact problems. 7(1–2):66–75, 1994. 109

[271] R. T. Rockafellar. Convex analysis. Princeton Mathematical Series, No. 28. Princeton University Press, Princeton, N.J., 1970. 115

[272] R. T. Rockafellar. On the maximal monotonicity of subdifferential mappings. Pacific J. Math., 33:209–216, 1970. 114, 116

[273] R. T. Rockafellar. Monotone operators and the proximal point algorithm. SIAM J. Control Optim., 14(5):877–898, 1976. 110, 114


[274] R. T. Rockafellar. Maximal monotone relations and the second derivatives of nonsmooth functions. Ann. Inst. H. Poincaré Anal. Non Linéaire, 2(3):167–184, 1985. 112, 121, 125, 127, 136, 139, 140, 143

[275] R. T. Rockafellar. Proto-differentiability of set-valued mappings and its applications in optimization. Ann. Inst. H. Poincaré Anal. Non Linéaire, 6:449–482, 1989. Analyse non linéaire (Perpignan, 1987). 136, 139, 140

[276] R. T. Rockafellar. Generalized second derivatives of convex functions and saddle functions. Trans. Amer. Math. Soc., 322(1):51–77, 1990. 112, 121, 127, 136, 139, 140, 141

[277] R. T. Rockafellar. Second-order convex analysis. J. Nonlinear Convex Anal., 1(1):1–16, 2000. 141

[278] R. T. Rockafellar and R. J.-B. Wets. Variational analysis, volume 317 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 1998. 112, 121, 126, 127, 136, 138, 139, 140, 141, 144

[279] A. Schumacher. Topologieoptimierung von Bauteilstrukturen unter Verwendung von Lochpositionierungskriterien. PhD thesis, Universität-Gesamthochschule-Siegen (Germany), 1995. 99

[280] R. Schumann. Regularity for Signorini’s problem in linear elasticity. Manuscripta Math., 63(3):255–291, 1989. 124

[281] T. W. Sederberg. Computer aided geometric design. 2014. Course notes. 99

[282] J. A. Sethian. Level set methods and fast marching methods, volume 3 of Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, second edition, 1999. Evolving interfaces in computational geometry, fluid mechanics, computer vision, and materials science. 99

[283] A. Shapiro. Directionally nondifferentiable metric projection. J. Optim. Theory Appl., 81(1):203–204, 1994. 136

[284] A. Shapiro. Differentiability properties of metric projections onto convex sets. J. Optim. Theory Appl., 169(3):953–964, 2016. 136

[285] M. Shillor, M. Sofonea, and J. Telega. Models and analysis of quasistatic contact: variational methods. Springer-Verlag, 2004. 124

[286] A. Signorini. Sopra alcune questioni di elastostatica. 1933. 124, 136

[287] A. Signorini. Questioni di elasticità non linearizzata e semilinearizzata. Rend. Mat. e Appl. (5), 18:95–139, 1959. 124, 136

[288] S. Simons and C. Zalinescu. A new proof for Rockafellar’s characterization of maximal monotone operators. Proc. Amer. Math. Soc., 132(10):2969–2972, 2004. 116

[289] J. Sokołowski and A. Zochowski. On the topological derivative in shape optimization. SIAM J. Control Optim., 37(4):1251–1272, 1999. 99

[290] J. Sokołowski and J.-P. Zolésio. Introduction to shape optimization, volume 16 of Springer Series in Computational Mathematics. Springer-Verlag, Berlin, 1992. Shape sensitivity analysis. 102, 109

[291] G. Stampacchia. Formes bilinéaires coercives sur les ensembles convexes. C. R. Acad. Sci. Paris, 258, 1964. 136


[292] G. Stampacchia. Variational inequalities. In Theory and Applications of Monotone Operators (Proc. NATO Advanced Study Inst., Venice, 1968), pages 101–192, 1969. 136

[293] N. Strömberg and A. Klarbring. Topology optimization of structures in unilateral contact.Struct. Multidiscip. Optim., 41(1):57–64, 2010. 109

[294] Y. Yu. On decomposing the proximal map. Advances in Neural Information Processing Systems, 2013. 111, 117

[295] E. H. Zarantonello. Projections on convex sets in Hilbert space and spectral theory. I. Projections on convex sets. pages 237–341, 1971. 136

