RECONNAISSANCE DE GESTEScasiez/NIHM/ReconnaissanceDeGestes.pdf · ¤ Traitement des points pour éliminer les points trop proches: tout point dont la distance est inférieure à 3

RECONNAISSANCE DE GESTESGéry Casiez http://cristal.univ-lille.fr/~casiez/ M2 IVI - NIHM

1

Minority report2

Iron man3

Plan

¨ Objectifs ¨ Technique libstroke ¨ Technique $1 ¨ Rubine ¨ DTW

4

Objectifs5

¨ Reconnaître des gestes réalisés par l’utilisateur pour les associer à des commandes

¨ On se limite aux gestes un seul trait (unistroke)

Unistroke - Graffiti (Palm OS)

Multistroke

Technique de reconnaissance de gestes élémentaires: libstroke (1997)

6

¨ Librairie disponible dans FVWM http://etla.net/libstroke/ ¨ Les gestes sont décrits par des séquences de

chiffres

http://etla.net/libstroke/

Libstroke: algorithme7

¨ 1) détermination de la boite englobante (bounding box) à partir de minx, maxx,miny, maxy

¨ 2) division de la boite en 9 cellules ¨ 3) étiquetage des points ¨ 4) factorisation ¨ 5) comparaison aux motifs enregistrés

¨ Extension Firefox: Mouse Gestures Redox

Techniques de reconnaissance8

¨ Classifieurs statistiques (Rubine, DTW) ¨ Modèles de Markov cachés ¨ Réseaux de neurones ¨ Méthodes ad-hoc (libstroke, $1)

Techniques de reconnaissance9

¨ $1 recognizer ¨ Méthode simple à implémenter et donnant des taux

de reconnaissance comparables à Rubine et DTW ¨ 1 seul exemple suffit ¨ Technique publiée dans l’article de recherche

"Gestures without Librairies, Toolkits or Training: A $1 Recognizer for User Interface Prototypes" à la conférence User Interface Sofware and Technology (UIST) en 2007 (écrit par Wobbrock, Wilson et Li)

$1 recognizer10

Algorithme11

¨ 1) L’utilisateur réalise un geste ¤ Le geste est représenté par une liste ordonnée de points

¨ 2) Ce geste est comparé à un ensemble de gestes de référence (templates) en utilisant une mesure de distance euclidienne

¨ 3) Le geste reconnu est celui pour lequel cette distance est minimale

Problèmes posés12

¨ Le nombre de points pour définir un geste va dépendre de la vitesse d’exécution, de la fréquence d’échantillonnage du périphérique…

Problèmes posés13

¨ Un geste peut être réalisé à différentes positions sur l’interface,

¨ Suivant différentes orientations

¨ Et différentes échelles

4 étapes14

¨ 1) Ré-échantillonnage du geste pour être invariant à la fréquence d’acquisition et la vitesse d’exécution du geste

¨ 2) Ré-orientation du geste pour être invariant à l’orientation suivant laquelle il est exécuté

¨ 3) Mise à l’échelle et translation pour être invariant à l’échelle de réalisation du geste et la position à laquelle il est exécuté

¨ 4) Reconnaissance du geste

1ère étape: Ré-échantillonnage15

¨ Le geste est défini par M points ordonnés ¨ On veut N points ordonnés équidistants les uns des

autres ¨ N = 64

2e étape: Rotation « indicative » 16

¨ 1) Calcul du centre du geste (centroïde) ¨ 2) Calcul de l’angle entre le centroïde, le premier

point et l’horizontale ¨ 3) Rotation des points en utilisant cet angle

3e étape: mise à l’échelle et translation17

¨ Mise à l’échelle non uniforme: on ramène le geste à un carré de référence

¨ 1) détermination de la boîte englobante du geste (bounding box) ¤ Détermination de minx, maxx, miny, maxy

¨ 2) Mise à l’échelle ¨ 3) Translation à l’origine

4e étape: reconnaissance18

¨ Un geste candidat C est comparé à chacun des gestes enregistrés (templates) Ti pour calculer la distance moyenne di entre les points correspondants

¨ Le template avec la distance moyenne la plus faible est le résultat de la reconnaissance

¨ La distance est tranformée en score entre 0 et 1

Limitations19

¨ Pas possible de distinguer un carré d’un rectangle, une ellipse d’un cercle, une flèche vers le bas d’une flèche vers le haut

¨ Pas possible de reconnaître des gestes « 1D »

Classifieur statistique

¨ Rubine ¨ Reconnaissance de gestes ne comportant qu’un seul

tracé (unistroke) ¨ Traitement statistique de caractéristiques

20

Rubine

¨ Liste de points en entrée ¤ Traitement des points pour éliminer les points trop

proches: tout point dont la distance est inférieure à 3 pixels du point précédent est supprimé

¤ Calcul d’un vecteur de caractéristiques statistiques ¤ Comparaison aux différents gestes de référence. Le

geste avec le score le plus grand est renvoyé. ¤ Détection des gestes ambigus

21

Caractéristiques22

Initialisation

¨ Calcul des statistiques moyennes pour chaque caratéristique de chaque geste

¨ Calcul de la matrice de covariance moyenne de tous les gestes

¨ Objectif: trouver les coefficients pondérateurs des caractéristiques statistiques qui permettent de maximiser le score des gestes de chaque classe

23

Reconnaissance

¨ Calcul des caractéristiques du geste candidat ¨ Calcul d’une probabilité de correspondance pour

chaque geste de référence ¨ Le geste avec la probabilité la plus importante est

choisi

24

Dynamic Time Warping

¨ DTW ¨ Déformation temporelle dynamique ¨ Déterminer pour chaque élément d’une séquence, le

meilleur élément correspondant dans l’autre séquence relativement à un certain voisinage et à une certaine métrique

¨ Complexité polynomiale

25

Applications

¨ Vidéo, audio, graphique... ¨ Toutes données qui peuvent être transformées en

représentation linéaire en fonction du temps (séries temporelles)

¨ Echantillons ordonnés par une étiquette de temps ¨ Reconnaissance vocale ¨ Reconnaissance de gestes off-line et on-line ¨ alignement de protéines...

26

Principe de base

¨ Séquence de référence R = [r1, r2, ..., rn] ¨ Séquence de test T = [t1, t2, ..., tm] ¨ Si m=n alors on peut calculer la distance entre les deux

signaux de la façon suivante (pas forcément idéal):

¨ Possibilité de calculer la distance euclidienne ou d’utiliser une autre métrique (fonction qui donne un réel)

¨ Plus possible d’utiliser cette méthode dès que n ≠ m ¨ Attention: les échantillons des séquences doivent être

équidistants en temps

27

D =n�

i=1

distance(ri, ti) (1)

1

Principe de base

¨ DTW réalise d’abord un alignement non linéaire en recherchant parmi tous les alignements possibles, celui qui minimise une fonction de coût cumulé

¨ «Time Warping»: Dilation ou compression des séquences pour obtenir le meilleur alignement possible

28

Principe de base

¨ Détermination du chemin W = [w1, w2, ..., wk] de longueur minimale

29 D =n�

i=1

distance(ri, ti) (1)

k�

i=1

distance(wi) (2)

1

Principe de base

¨ Conditions aux frontières ¤ w1 = (r1, t1) ¤ wk = (rn, tm)

¨ Contraintes locales ¤ Monotonicité pour respecter le séquencement des points ¤ Eviter les sauts dans le temps ¤ Pour tout couple (ri, tj), le choix des prédécesseurs est limité à

n (ri-1,tj), (ri,tj-1), (ri-1,tj-1)

¨ Exhaustivité ¤ Chaque élément de R doit être mis en relation avec au moins

un élément de T et vice-versa ¤ max(m, n) ≤ k ≤ m + n - 1

30

Principe de base31

Principe de base

¨ Programmation dynamique ¨ Fonction d’optimisation:

¤ Soit D(i,j) la longueur du chemin entre (r1, t1) et (ri, tj) ¨ Récursion:

¤ D(i,j) = dist(ri,tj) + min(D(i-1,j), D(i-1,j-1), D(i, j-1)) ¤ condition initiale: D(1,1) = dist(r0,t0)

¨ Distance minimale entre les deux séquences ¤ D(n,m)

32

Mise en application

¨ Construction d’une matrice D de dimensions n x m dans laquelle chaque élément (i,j) contient D(i,j)

¨ Remplissage de D(1,1) avec la condition initiale ¨ Utilisation de la formule récursive pour remplir la

matrice ligne par ligne ou colonne par colonne ¨ Cas particuliers première ligne et première colonne ¨ Distance minimale donnée par l’élément D(n,m) ¨ La distance minimale peut être normalisée par la

longueur du chemin

33

Algorithme34

i

i

00 00

0 0 0

Chemin de déformation35

Chemin de déformation

¨ A chaque calcul de D(i,j), sauvegarde du prédécesseur qui minimise la distance

¨ Parcours des prédécesseurs en partant de D(n,m)

36

Complexité

¨ Complexité: O(m*n) ¨ Optimisation en limitant la région de recherche

37

Optimisation

¨ La zone supérieure gauche et la zone inférieure droite ne sont pas calculées

¨ Les distances locales associées sont mises à une valeur très élevée afin que le chemin n’y passe pas

38

Optimisation39

Optimisation

¨ Mise à l’échelle ¤ Réduire les tailles de R et T

40

Singularités

¨ Une singularité apparaît quand un point d’une séquence est associé à de nombreux points de l’autre séquence sans raison valable

41

2

Although DTW has been successfully used in many domains, it can produce

pathological results. The crucial observation is that the algorithm may try to explain

variability in the Y-axis by warping the X-axis. This can lead to unintuitive alignments

where a single point on one time series maps onto a large subsection of another time

series. We call examples of this undesirable behavior "singularities". A variety of ad-hoc

measures have been proposed to deal with singularities. All of these approaches

essentially constrain the possible warpings allowed. However they suffer from the

drawback that they may prevent the "correct" warping from being found.

In simulated cases, the correct warping can be known by warping a time series and

attempting to recover the original (see Section 4). In naturally occurring cases we take

"correct" to mean intuitively obvious "feature to feature" alignment as in Figure 2.B.

An additional problem with DTW is that the algorithm may fail to find obvious,

natural alignments in two sequences simply because a feature (i.e peak, valley, inflection

point, plateau etc.) in one sequence is slightly higher or lower than its corresponding

feature in the other sequence. Figure 2 illustrates this problem.

Figure 2 : A) Two synthetic signals (with the same mean and variance). B) The natural "feature to

feature" alignment. C) The alignment produced by dynamic time warping. Note that DTW failed to align

the two central peaks because they are slightly separated in the Y-axis

In this paper we address both these problems by introducing a modification of DTW.

The crucial difference is in the features we consider when attempting to find the correct

warping. Rather than use the raw data, we consider only the (estimated) local derivatives

of the data.

The rest of the paper is organized as follows. Section 2 contains a review of the classic

DTW algorithm, including the various techniques suggested to prevent singularities. In

Section 3 we introduce and demonstrate our extension, which we call Derivative

Dynamic Time Warping (DDTW). Section 4 contains experimental results, and in

Section 5 we offer conclusions and discuss possible directions for future work.

0 5 10 15 20 25 30 0 5 10 15 20 25 300 5 10 15 20 25 30

A B C

Singularités: technique de fenêtrage

¨ Ajout de contraintes dans les alignements possibles ¨ Technique de fenêtrage

¤ Donne une borne supérieure à la singularité

42

for j=max(1,i-w); j <= min (m,i+w); j++ doi

Singularités: slope weighting

¨ Pondérer les déplacements suivant les directions: ¤ D(i,j) = dist(ri,tj) + min(X*D(i-1,j), D(i-1,j-1), X*D(i, j-1)) ¤ X>1

¨ Force l’alignement suivant la diagonale

43

Singularités: Step patterns

¨ D(i,j) = dist(ri,tj) + min(D(i-1,j-2), D(i-1,j-1), D(i-2, j-1)) ¨ Force le déplacement suivant la diagonale

44

5

We could replace equation 5 with !(i,j) = d(i,j) + min{ !(i-1,j-1) , !(i-1,j-2) , !(i-

2,j-1) }, which corresponds with the step-pattern show in Figure 4.B. Using this

equation the warping path is forced to move one diagonal step for each step

parallel to an axis. Dozens of different step-patterns have been considered,

Rabiner and Juang (1993) contains a review.

All the above may help to mitigate the problem of singularities, but at the risk of

missing the correct warping. Additionally, it is not obvious how to chose the various

parameters (R for Windowing and X for Slope Weighting) or Step-Pattern.

3 Derivative dynamic time warping

If DTW attempts to align two sequences that are similar except for local accelerations

and decelerations in the time axis, the algorithm is likely to be successful. The algorithm

has problems when the two sequences also differ in the Y-axis. Global differences,

affecting the entire sequences, such as different means (offset translation), different

scalings (amplitude scaling) or linear trends can be efficiently removed (Keogh and

Pazzani 1998, Agrawal et. al. 1995). However the two series may also have local

differences in the Y-axis, for example a valley in one sequence may be deeper that the

corresponding valley in the other time series. Consider Figure 5 as an example. Two

identical sequences will clearly produce a one to one alignment. But if we slightly

change a local feature, in this case the depth of a valley, DTW attempts to explain the

difference in terms of the time-axis and produces two singularities.

Figure 5: Using DTW, two identical sequences (a) will clearly produce a one to one alignment (b).

However, if we slightly change a local feature, in this case the depth of a valley (c), DTW attempts to

explain the difference in terms of the time-axis and produces two singularities (d).

B A

Figure 4: A pictorial representation of two alternative step-patterns:

A) The pattern corresponding to !(i,j) = d(i,j) + min{ !(i-1,j-1) , !(i-1,j ) , !(i,j-1) }

B) The pattern corresponding to !(i,j) = d(i,j) + min{ !(i-1,j-1) , !(i-1,j-2) , !(i-2,j-1) }

a b dc

Singularités

¨ Problème: ces contraintes peuvent empêcher d’obtenir l’alignement correct

¨ Comment définir les différents paramètres des techniques de levée des singularités (taille de fenêtre, coefficients de pondération, motifs)?

45

DDTW: Derivative Dynamic Time Warping46

5

We could replace equation 5 with !(i,j) = d(i,j) + min{ !(i-1,j-1) , !(i-1,j-2) , !(i-

2,j-1) }, which corresponds with the step-pattern show in Figure 4.B. Using this

equation the warping path is forced to move one diagonal step for each step

parallel to an axis. Dozens of different step-patterns have been considered,

Rabiner and Juang (1993) contains a review.

All the above may help to mitigate the problem of singularities, but at the risk of

missing the correct warping. Additionally, it is not obvious how to chose the various

parameters (R for Windowing and X for Slope Weighting) or Step-Pattern.

3 Derivative dynamic time warping

If DTW attempts to align two sequences that are similar except for local accelerations

and decelerations in the time axis, the algorithm is likely to be successful. The algorithm

has problems when the two sequences also differ in the Y-axis. Global differences,

affecting the entire sequences, such as different means (offset translation), different

scalings (amplitude scaling) or linear trends can be efficiently removed (Keogh and

Pazzani 1998, Agrawal et. al. 1995). However the two series may also have local

differences in the Y-axis, for example a valley in one sequence may be deeper that the

corresponding valley in the other time series. Consider Figure 5 as an example. Two

identical sequences will clearly produce a one to one alignment. But if we slightly

change a local feature, in this case the depth of a valley, DTW attempts to explain the

difference in terms of the time-axis and produces two singularities.

Figure 5: Using DTW, two identical sequences (a) will clearly produce a one to one alignment (b).

However, if we slightly change a local feature, in this case the depth of a valley (c), DTW attempts to

explain the difference in terms of the time-axis and produces two singularities (d).

B A

Figure 4: A pictorial representation of two alternative step-patterns:

A) The pattern corresponding to !(i,j) = d(i,j) + min{ !(i-1,j-1) , !(i-1,j ) , !(i,j-1) }

B) The pattern corresponding to !(i,j) = d(i,j) + min{ !(i-1,j-1) , !(i-1,j-2) , !(i-2,j-1) }

a b dc

@INPROCEEDINGS{Keogh01derivativedynamic, author = {Eamonn J. Keogh and Michael J. Pazzani}, title = {Derivative Dynamic Time Warping}, booktitle = {In First SIAM International Conference on Data Mining (SDM’2001}, year = {2001}}

DDTW: Derivative Dynamic Time Warping

¨ Utilisation de la dérivée des données des séquences

¨ Calcul du carré de la différence des dérivées

47

6

The weakness of DTW is in the features it considers. It only considers a datapoints Y-

axis value. For example consider two datapoints qi and c

j which have identical values, but

qi

is part of a rising trend and cj is part of a falling trend. DTW considers a mapping

between these two points ideal, although intuitively we would prefer not to map a rising

trend to a falling trend. To prevent this problem we propose a modification of DTW that

does not consider the Y-values of the datapoints, but rather considers the higher level

feature of "shape". We obtain information about shape by considering the first derivative

of the sequences, and thus call our algorithm Derivative Dynamic Time Warping

(DDTW).

3.1 Algorithm details

As before we construct an n-by-m matrix where the (ith, j

th) element of the matrix

contains the distance d(qi,c

j) between the two points q

i and c

j. With DDTW the distance

measure d(qi,c

j) is not Euclidean but rather the square of the difference of the estimated

derivatives of qi and c

j. While there exist sophisticated methods for estimating

derivatives, particularly if one knows something about the underlying model generating

the data, we use the following estimate for simplicity and generality.

1 < i < m (6)

This estimate is simply the average of the slope of the line through the point in

question and its left neighbor, and the slope of the line through the left neighbor and the

right neighbor. Empirically this estimate is more robust to outliers than any estimate

considering only two datapoints. Note the estimate is not defined for the first and last

elements of the sequence. Instead we use the estimates of the second and penultimate

elements respectively. For noisy datasets we use exponential smoothing (Mills 1990)

before attempting to estimate the derivatives.

DDTW’s time complexity is O(mn), which is the same as DTW. There are some added

constant factors because of derivative estimating step, but there is no need to remove

offset translation, a necessary step for DTW. Empirically the two algorithms take

approximately the same time. A variety of optimizations have been proposed for DTW

(Myers et. al 1990). We omit discussion of them here for brevity, but note that they apply

equally well to DDTW.

Figure 6: 1) Two artificial signals. 2) The intuitive feature to feature warping alignment. 3) The

alignment produced by classic DTW. 4) The alignment produced by DDTW.

2

)2)(()(][ 111 !+! !+!= iiii

x

qqqqqD

0 5 10 15 20 25 30

4

1 2

3

6

The weakness of DTW is in the features it considers. It only considers a datapoints Y-

axis value. For example consider two datapoints qi and c

j which have identical values, but

qi

is part of a rising trend and cj is part of a falling trend. DTW considers a mapping

between these two points ideal, although intuitively we would prefer not to map a rising

trend to a falling trend. To prevent this problem we propose a modification of DTW that

does not consider the Y-values of the datapoints, but rather considers the higher level

feature of "shape". We obtain information about shape by considering the first derivative

of the sequences, and thus call our algorithm Derivative Dynamic Time Warping

(DDTW).

3.1 Algorithm details

As before we construct an n-by-m matrix where the (ith, j

th) element of the matrix

contains the distance d(qi,c

j) between the two points q

i and c

j. With DDTW the distance

measure d(qi,c

j) is not Euclidean but rather the square of the difference of the estimated

derivatives of qi and c

j. While there exist sophisticated methods for estimating

derivatives, particularly if one knows something about the underlying model generating

the data, we use the following estimate for simplicity and generality.

1 < i < m (6)

This estimate is simply the average of the slope of the line through the point in

question and its left neighbor, and the slope of the line through the left neighbor and the

right neighbor. Empirically this estimate is more robust to outliers than any estimate

considering only two datapoints. Note the estimate is not defined for the first and last

elements of the sequence. Instead we use the estimates of the second and penultimate

elements respectively. For noisy datasets we use exponential smoothing (Mills 1990)

before attempting to estimate the derivatives.

DDTW’s time complexity is O(mn), which is the same as DTW. There are some added

constant factors because of derivative estimating step, but there is no need to remove

offset translation, a necessary step for DTW. Empirically the two algorithms take

approximately the same time. A variety of optimizations have been proposed for DTW

(Myers et. al 1990). We omit discussion of them here for brevity, but note that they apply

equally well to DDTW.

Figure 6: 1) Two artificial signals. 2) The intuitive feature to feature warping alignment. 3) The

alignment produced by classic DTW. 4) The alignment produced by DDTW.

2

)2)(()(][ 111 !+! !+!= iiii

x

qqqqqD

0 5 10 15 20 25 30

4

1 2

3

DDTW: Derivative Dynamic Time Warping48

8

These results confirm what a casual perusal of Figure 7 tells us. DTW attempts to

correct minor differences between the sequences by wild warpings of the time axis. In

contrast DDTW, which considers higher levels features is much less inclined to "find" a

warping were none exists.

Figure 7: Examples of the three experimental datasets used in this paper. Each box in the leftmost

column contains two related sequences that do not contain time warping, but do have minor differences

in the Y-axis. The warpings "discovered" by the two algorithms discussed in this paper are therefore

completely spurious. The two rightmost columns show examples of the warpings returned by both

algorithms. Table 1 contains a numerical comparison.

4.2 Finding the correct warping

To test the ability of an algorithm to discover the correct warping between two

sequences we need sequences for which the correct warping is known. Our approach is to

take a sequence Q and to make a copy of it. This copy, Q’ then has a warping randomly

inserted into it. We can then use Q and Q’ as input into the two algorithms and compare

warpings.

To distort sequence Q’ we begin by randomly choosing an anchor point, somewhere on

the sequence. We first distort the Y-axis by adding or subtracting a Gaussian bump,

centered on the anchor point, to the sequence. We did this for 3 different height bumps

0.0, 0.1 and 0.2 (0.0 corresponding to no bump). Next, we randomly shifted the anchor

point 15 time-units left or right. The other datapoints were moved to compensate for this

shift by an amount that depended on their inverse squared distance to the anchor point,

thus localizing the effect. After the X-axis transformation we interpolated the data back

onto the original, equi-spaced X-axis. The net effect of the two transformations is a

smooth local distortion of the original sequence, as shown in Figure 8.

Space Shuttle

Exchange rate

EEG

DTW DDTW

Recherche de motifs49

Reconnaissance de gestes

¨ Pré-traitement des gestes de référence et du geste réalisé par l’utilisateur ¤ mise à l’échelle, changement origine, rotation…

¨ Alignement avec chacun des gestes ¨ Le geste de référence pour lequel la distance entre

les deux séquences est la plus faible est considéré comme reconnu

50

Affine invariant DTW

¨ Problème: identifier dans une séquence un objet qui subit une transformation affine

¨ Choix de la fonction de coût?

51

!!"#"$"$$ !"#$%!&'"% !! ""#$ ()

!!"#"$"## !&'"%!"#$% !! "%#$ ()))))))))))%*+!)

,-./.) &&##

##%

&

&'""

%

&

&'""#

%#$

%$

*

!%(

*

!%(

*(

*()

!0!%%!!%%1!%((

*

!%(( "&'"!&!

%

&

"&'"!&!####$$$$( """"""# &

#

()

!0!%%!!%%1!%((

*

!%(( "&'"!&!

%

&

"&'"!&!##$$$$##) ""%"""# &

#

()

0!%!%1 +

(

+

(

*

!&!!&!

%

&

##$$* "%"# &#

2))))))))))))))))))))))))))))))))

)

4. Experiments on Online Rotated

Handwriting Recognition

)3#45/.)+)6789:;.")'<)='>8>.?)@.">#$4)A89:;.")

)

@') .789#$.) >-.) 5>#;#>B) '<) CDEF@G() ,.) -8H.)

.7.&5>.?) /.&'4$#>#'$) .7:./#9.$>") '$) 8) ".>) '<) /'>8>.?)

'$;#$.)-8$?,/#>#$4)?#4#>)?8>8)</'9)>-.)IJD)=.:'"#>'/B)

'<) 98&-#$.) ;.8/$#$4) ?8>8K8".") L*MN2) @-.) ?8>8K8".)

#$&;5?.") O(PQP) >/8#$#$4) "89:;.") 8$?) R(PQS) ,/#>./)

#$?.:.$?.$>) >.">#$4) "89:;."2) D<) 8) "89:;.) &'$"#">") '<)

9'/.)>-8$)'$.)">/'T.(),.)&'$$.&>)>-.9)#$>')'$.)">/'T.)

<'/)"#9:;#&#>B2)C;;)>-.)"89:;.")8/.)$'/98;#U.?)>')8)"#U.)

'<) *+MVQM2) G.) "89:;.) +M) :'#$>") ,#>-) .W58;) ":8&.)

#$>./H8;)8;'$4)>-.):.$)>/8&.)'<).8&-)"89:;.2)@-.$).8&-)

'<) >-.) >.">#$4) "89:;.") ,8") /'>8>.?) KB) 8$) 8$4;.)

/8$?'9;B) 4.$./8>.?) K.>,..$) LE!X+()!X+N2) A'9.) '<) >-.)

/'>8>.?)>.">#$4)"89:;.")8/.)"-',$)#$)3#42)+2))

F@G)-8?)K..$):/'H.?)>')K.)8$).<<#&#.$>)9.>-'?)>')

&8;&5;8>.)?#">8$&.)#$)'$;#$.)-8$?,/#>#$4)/.&'4$#>#'$)LRE

YN2)Z./.),.)".;.&>)>-.)"#9:;.)+E[.8/.">)[.#4-K'/)%[[!)

&;8""#<#./) >') &'9:8/.) >-.) /.&'4$#>#'$) :./<'/98$&.) '<)

>,') ?#">8$&.) 9.>/#&"\) F@G) 8$?) CDEF@G2) C") >-.)

'$;#$.) "89:;.") -8H.) K..$) $'/98;#U.?() >') K.) <8#/) <'/)

F@G) 8$?) CDEF@G() ,.) 5".) >-.) /'>8>#'$) 8$?) "-#<>)

#$H8/#8$>) ,#>-) <#7.?) "&8;.) /8>.) "]*) %"..) A.&>#'$) R2R!2)

@-.)/.&'4$#>#'$)/."5;>")8/.)"5998/#U.?)#$)@8K;.)*2)G.)

&8$)<#$?)>-8>)>-.)CDEF@G):./<'/9.?)95&-)K.>>./)>-8$)

>-.) F@G) 9.>-'?) ?#?() ,-#&-) #$?#&8>.?) >-8>) >-.) CDE

F@G) #") 8) 9'/.) .<<.&>#H.) 9.>-'?) >') &'9:8/.) '$;#$.)

/'>8>.?) "-8:.") >-8$) &;8""#&8;) F@G2) @-.) 98#$)

?/8,K8&T) '<) 5"#$4) [[) &;8""#<#./) #") #>") ;',) ":..?2)

Z',.H./()'$.)&8$)/.?5&.)>-.)&'9:5>8>#'$)>#9.)4/.8>;B)

KB)5"#$4);.8/$.?):/'>'>B:.")LPN2)F5.)>'):84.);#9#>8>#'$()

,.)-8H.)>')'9#>)>-.)?.>8#;.?)?#"&5""#'$)'$)>-#"2)

)

@8K;.)*)=.&'4$#>#'$)/8>.")'<)F@G)8$?)CDEF@G)

%+)#")>-.)$59K./)'<)$.#4-K'/")5".?!)

+, *) R) Y) *M)

F@G) ^P2MM_ ^P2PP_) ^Y2PP_) ^Y2SO_

CDEF@G QY2YP_ QY2**_) QP2^Y_) QP2MS_

)

5.Conclusion )

@-#") :8:./) :/':'".") >-.) 8<<#$.) #$H8/#8$>) ?B$89#&)

>#9.),8/:#$4)9.>-'?)>')&'9:8/.)>,')".W5.$&."),-#&-)

98B)-8H.);'&8;)"-8:.)?.<'/98>#'$)8$?)5$?./4')4;'K8;)

5$T$',$) 8<<#$.) >/8$"<'/92) G.) <'/95;8>.?) >-.)

>/8$"<'/98>#'$)98>/#7)8$?),8/:#$4):8>-) #$>')8)5$#<#.?)

':>#98;):/'K;.9)8$?)?.H.;':.?)8$) #>./8>#H.)8;4'/#>-9)

>') <#$?) >-.) ':>#98;) "';5>#'$2) 67:./#9.$>") '$) '$;#$.)

/'>8>.?) -8$?,/#>#$4) #$?#&8>.?) >-8>) '5/) 9.>-'?) &8$)

"#4$#<#&8$>;B)#9:/'H.)>-.)/.&'4$#>#'$)/8>.")'<)>-.)F@G2)

3'/) <5>5/.) ,'/T() ,.) 8/.) &'$"#?./#$4) *!) >') ">5?B) >-.)

/.45;8>#'$")'<)>/8$"<'/98>#'$)98>/#7)8$?)+!)>')&'9K#$.)

>-.)">8>#">#&8;);.8/$#$4)9.>-'?"),#>-)>-.)CDEF@G2)

)

References )L*N) `2) =8K#$./) 5$?) a2Z2) b58$42) c35$?89.$>8;") '<) A:..&-)

=.&'4$#>#'$c2)-"*%!&(*,./00,-122)*QQR)

L+N) Z2) A8T'.) 8$?) A2) J-#K8() cC) FB$89#&) d/'4/899#$4) C;4'/#>-9)

e:>#9#U8>#'$)<'/)A:'T.$)G'/?)=.&'4$#>#'$(c)3444,1"/%56,7%,899-()

+^%*!\)PREPQ2)*QOS)

LRN) Z2) f#>'98() A2#) I&-#?8() 8$?) Z2) A8T'.() ce$;#$.) &-8/8&>./)

/.&'4$#>#'$)5"#$4).#4.$E?.<'/98>#'$"(c)):!;,3<.=2)@'TB'()+MMP)

LPN) b2) C;'$() g2) C>-#>"'"() 8$?) A2) A&;8/'<<2) ce$;#$.) 8$?) e<<;#$.)

J-8/8&>./) =.&'4$#>#'$) I"#$4) C;#4$9.$>) >') d/'>'>B:."(c) -"7(6,

3>?82()h'/.8)+MMY

LYN)J2)a8-;98$$)8$?)Z2)a5/T-8/?>()c@-.),/#>./) #$?.:.$?.$>)'$;#$.)

-8$?,/#>#$4)/.&'4$#>#'$)"B">.9)<;'4)'$)-8$?)8$?)&;5">./)4.$./8>#H.)

">8>#">#&8;)?B$89#&)>#9.),8/:#$4(c)3444,1"/%56,7%,-8@3()+^%R!\+QQE

R*M()+MMP)

L^N)@2)=8>-)8$?)=2)f8$98>-82)cG'/?)#984.)98>&-#$4)5"#$4)?B$89#&)

>#9.),8/:#$4c()-"7(6,>A-2()H';2)+2)::2)Y+*EY+O()+MMR)

LON) F#'$#"#'() J2=2d2i) h#9() Z2j2c[.,) <.8>5/.") <'/) 8<<#$.E#$H8/#8$>)

"-8:.)&;8""#<#&8>#'$(c)-"7(6,3>3-()::\+*RY)E)+*RS(,+MMP)

LSN)G#;;#89)Z)d2()A85;)C)@2()G#;;#89)@)g2()a/#8$)d)32()c[59./#&8;)

=.&#:.") #$) Jkk\) @-.) C/>) '<) A&#.$>#<#&2) J'9:5>#$4(c) >/BC"&)D*,

E%&F*"5&!#,-"*55()J89K/#?4.()6$4;8$?()*QQ+)

LQN) F.9:">./() C2() `8#/?() [2() 8$?) =5K#$() F2) cf87#959) ;#T.;#-''?)

</'9)#$&'9E:;.>.)?8>8)H#8)>-.)6f)8;4'/#>-92(c)G7H"%/0,7I,!;*,27#/0,

9!/!&5!&(/0,97(&*!#J,9*"&*5,K(RQ%*!\*lRS2)*QOO)

L*MN) Z.>>#&-) A(2) ) a;8T.) J2`() 8$?) f./U) J2b2) cIJD) =.:'"#>'/B) '<)

98&-#$.) ;.8/$#$4) ?8>8K8"."c() *QQS)

L->>:\XX,,,2#&"25&#2.?5Xm9;.8/$Xf`=.:'"#>'/B2->9;N)

Affine invariant DTW52

!"#$%&'#()*!+ ,-.+ *"/"-((0+ 1*%&'2)+ '.)+ -.'/3)&+

4"&1.5+ /3)+ %&',)!!6+ 7.4+ 18+ /3)+ !9"-&)+ '8+ :",(14)-.+

41!/-.,)+ 1!+ "!)4+ /'+ 1.+ /3)+ !),'.4+ !"#$%&'#()*+ /'+

,-(,"(-/)+!;"#<+ $%=<+ /3)+ ,'.2)&5).,)+'8+ /31!+*)/3'4+ ,-.+

#)+ -.-(0>)4+ #0+ -+ !1*1(-&+ ?-0+ '8+ :@%),/-/1'.$

A-@1*1>-/1'.+ ;:A=+ -(5'&1/3*+ BCD6+ E3)+ 4)/-1(!+ '8+ '"&+

-(5'&1/3*+-&)+-!+8'(('?!F+

Optimization Algorithm of AI-DTW:

G.1/1-(1>)++

E3)+?-&%1.5+%-/3+&;H=I'()*+,(-;(<+.=6+

G/)&-/1'.+."*#)&+/IH+

J31()+.'/+,'.2)&5).,)+

++++++++++++/+I/KHL+

++++++++++++M%4-/)+/3)+/&-.!8'&*-/1'.+*-/&1@+#0F+

++++++++++++++++++++++ ; H=

N; =

; =H

-&5*1.O P/

0/

# & #, #

, " $ ,!

"

" !# +6++

M%4-/)+/3)+?-&%1.5+%-/3+#0F+

&;/=I'()*+,(-;(<+.,;/==6+

:.4+J31()++

Q'*)+ )@-*%()!+ '8+ /3)+ -#'2)+ -(5'&1/3*+ -&)+

1(("!/&-/)4+ 1.+R15"&)+H<+-.4+'.)+,-.+ 81.4+ /3-/+ /3)+*'!/+

)&&'&+ &)4",/1'.+ 1!+ 4'.)+ 1.+ /3)+ 81&!/+ 1/)&-/1'.!+ -.4+ /3)+

7G$SEJ+,-.+"!"-((0+,'.2)&5)+1.+-+8)?+1/)&-/1'.!6+++

3.3 Rotation, Scale and Shift Invariant DTW +

+++G.+*-.0+N$41*).!1'.+!3-%)+-.-(0!1!+-%%(1,-/1'.!<+

!",3+-!+'.(1.)+3-.4?&1/1.5<+1/+1!+'8/).+.),)!!-&0+/'+"!)+

&'/-/1'.+ !,-()+ -.4+ !318/+ 1.2-&1-./+ ;TQQ+ 1.2-&1-./+ 8'&+

!3'&/=+*)/3'46+T'/-/1'.+!,-()+-.4+!318/+,-.+#)+!)).+-!+

!%),1-(+,-!)!+'8+-881.)+/&-.!8'&*6+U)&)<+?)+41!,"!!+/31!+

%&'#()*+!)%-&-/)(0+8'&+/3)+8'(('?1.5+&)-!'.!F+

;H= E3)+ TQQ+ 1.2-&1-./+ %&'#()*+ ,-.+ #)+ !'(2)4+ *'&)+

)881,1)./(0+?1/3'"/+/3)+*-/&1@+,-(,"(-/1'.+1.+:9+;C=6+

;N= G.+ !'*)+ -%%(1,-/1'.!<+ /3)+ 7G$SEJ+ *-0+ '2)&$81/+

/3)+ 1.%"/+ !)9").,)6+ R'&+ )@-*%()+ 1.+ /3)+ '.(1.)+

3-.4?&1/1.5<+ /3)+ !-*%()!+ '8+ 4151/+ VNV<VWV+ ,-.+ #)+

/&-.!8'&*)4+!3-%)!+2)&0+ !1*1(-&+ /'+4151/+ VHV<+ 18+?)+

?&1.X()+ /3)+ !-*%()!+ -('.5+1+ ,''&41.-/)6+E3"!+ 8'&+

/3)!)+)@-*%()!<+ 1/+ 1!+*'&)+.-/"&-(+ /'+ !)/+ /3)+ !,-()+

&-/)!+'8+1+,''&41.-/)+-.4+2+,''&41.-/)+-!+/3)+!-*)6+

Y12).+-+%'1./+ ;1<+2=<+ /3)+,''&41.-/)!+ -8/)&+ &'/-/1'.<+

!,-()+-.4+!318/+/&-.!8'&*-/1'.+,-.+#)+,-(,"(-/)4+#0F+

13I+$,'!;!=+1K+$!1.;!=2K!1+

23I$$!1.;!=+1K+$,'!;!=2K!2<44444444++++++++++;HZ=++?3)&)+$+4).'/)!+/3)+!,-()+&-/)<+!+1!+/3)+&'/-/1'.+-.5()<+

-.4+ !1<+ !2+ &)%&)!)./!+ /3)+ !318/+ %-&-*)/)&!6+ R'&+

!1*%(1,1/0<+?)+4).'/)+/3)+/&-.!8'&*-/1'.+1.+:96+;HZ=+#0+

;13<+ 23=I5;1<+ 2=<+ -.4+ -!!"*)+ /3-/+ )-,3+ )()*)./+ "#I+ ;1"6#<+

2"6#=+ 1!+ -+ N$41*).!1'.+ 2),/'&+ ,'*%'!)4+ #0+ /3)+

,''&41.-/)!+ + '8+ %'1./+ "#6+ Q1*1(-&(0+ $%I+ ;1$6#<+2$6#=6+Y12).+

?-&%1.5+ %-/3+ &<+ /3)+ 81&!/+ !"#+ '%/1*-(+ %&'#()*+

41!,"!!)4+1.+Q),/1'.+[6N+,-.+#)+&)$8'&*"(-/)4+-!F+

#"

!"0

#

#&#78$

$5".(9H

N

=;<<<

=;=<;*1.-&5$

6+++++++++++;HH=+

\0+ !'(21.5+ /3)+ 8".,/1'.!+ "9]"$IZ<+ + "9]"!IZ<4

"9]"8IZ<4"9]"7IZ<+?)+3-2)F+

H/-. ; =!

:$

!" <+! ;!:$ ]NN

%" <+

! "! #! $! %!

!

"!

#!

$!

%!

&!!

&"!

'()*+,-

./! ! /! &!!&!

"!

0!

#!

/!

$!

1!

%!

2!

&!!

&&!'()*+,3,45567"$62!"%

! "! #! $! %!

!

"!

#!

$!

%!

&!!

&"!

'+56,&,455671602&/

! "! #! $! %!

!

"!

#!

$!

%!

&!!

&"!

'+56,#,45567/60%#&

! / &!!

/

&!

&/

"!

"/

0!

'+458+9:(

;55:5

++

! "! #! $! %!

!

"!

#!

$!

%!

&!!

&"!

'()*+,-

"! #! $! %!!

"!

#!

$!

%!

&!!

&"!'()*+,3 ,455670$62!1/

! "! #! $! %!

!

"!

#!

$!

%!

&!!

&"!

'+56,&,4556726%!2%

! "! #! $! %!

!

"!

#!

$!

%!

&!!

&"!

'+56,#,45567$6%!00

! / &!!

/

&!

&/

"!

"/

0!

'+458+9:(

;55:5

+R15"&)+ H+ :@-*%()!+ '8+ 7G$SEJ6+ E'%+ &'?F+ 4151/+ N6+ \'//'*+ &'?F+ 4151/+ ^6+ _)8/+ ,'("*.F+ G.%"/+ %-//)&.+ EL+ N.4+

,'("*.F+G.%"/+%-//)&.+T+-8/)&+&'/-/1'.6+[&4+,'("*.F+%-//)&.+T+-8/)&+ /3)+81&!/+ 1/)&-/1'.+'8+7G$SEJ6+`/3+,'("*.F+

%-//)&.+T+-8/)&+/3)+`/3+1/)&-/1'.L+T153/+,'("*.F+:&&'&+,"&2)+-('.5+1/)&-/1'.+."*6+

Yu Qiao and Makoto Yasuhara. 2006. Affine Invariant Dynamic Time Warping and its Application to Online Rotated Handwriting Recognition. In Proceedings of the 18th International Conference on Pattern Recognition - Volume 02 (ICPR '06), Vol. 2. IEEE Computer Society, Washington, DC, USA, 905-908. DOI=10.1109/ICPR.2006.228 http://dx.doi.org/10.1109/ICPR.2006.228

http://dx.doi.org/10.1109/ICPR.2006.228

Multi-Dimensional Dynamic Time Warping

¨ Comment appliquer DTW quand plusieurs grandeurs sont mesurées à chaque instant?

¨ Modification de la fonction de distance: ¤ Chaque dimension est normalisée avec une moyenne de

0 et une variance de 1 ¤ Calcul de la somme des valeurs absolues des

différences suivant chaque dimension

53

G. A. ten Holt, M. J. T. Reinders and E. A. Hendriks. Multi-dimensional dynamic time warping for gesture recognition. In Proceedings of the Thirteenth annual conference of the Advanced School for Computing and Imaging, 2007.

Multi-Dimensional Dynamic Time Warping54

Figure 3: The necessity for MD-DTW. (a) shows two artificial 2D time series of equal mean and variance. Di-mension 1 is shown in column 1, dimension 2 in column 2. The series contain synchronisation information inboth dimensions. If 1D-DTW is performed using the first dimension, the result is suboptimal for dimension 2, asillustrated in (b). Note that the peaks and valleys in dimension 2 are not aligned properly. 1D-DTW on dimension2 gives a similar suboptimal match for dimension 1. MD-DTW takes both dimensions into account and finds thebest synchronisation (c).

result. MD-DTW takes both dimensions into accountin finding the optimal synchronisation. The result isa synchronisation that is as ideal as possible for bothdimensions, as shown in figure 3 (c). This is the ad-vantage of MD-DTW over regular DTW.

Though the series in figure 3 is artificial, the sit-uation it depicts is not unrealistic. Figure 4 shows asimilar situation from a real-world multi-dimensionalseries: x- and y-co-ordinates of the right hand for theDutch sign for acrobat. Here, the y-co-ordinate is in-formative for the first part of the series, and the x-co-ordinate for the second part. To get both peaksproperly matched, MD-DTW is necessary.

2.2 DTW on Derivatives[4] argue that for synchronising shape-

characteristics of series (such as peaks and slopes),it is beneficial to perform DTW on the first-orderderivatives of the feature values (DDTW). Since wewant to perform such shape-matching in each of ourdimensions, we considered this option for the MD-DTW algorithm. In our case, this meant taking thefirst-order derivative in each dimension separately.This gives us information about the slopes and peaksin each dimension. The series were first smoothed

in each dimension with a Gaussian filter (� = 5) todiminish noise effects. Then an approximation of thederivative was taken in each dimension using the filterder(a(t)) = (a(t + 1)� a(t� 1))/2.

Now we can perform two types of MD-DTW: onthe feature values and on their derivatives. As a thirdtype, we took the derivatives and added them to theseries as extra dimensions, doubling the dimensional-ity of the series. In this setting, both the feature valuesthemselves and their derivatives are taken into consid-eration when searching for the ideal warp. We testedour algorithm in all three settings. They are denotedas S(ignal), D(erivative) and SD(signal+derivative).In each case, the series were smoothed and nor-malised before (MD)DTW was commenced.

2.3 Dimension SelectionTo compare series with various dimensions, it is

necessary to normalise the dimensions. However, thisis a problem if a dimension contains only noise. Forexample, position data gathered on the non-dominanthand in a one-handed gesture. Normalisation willenlarge the noise in this dimension to the same pro-portions as the informative data in other dimensions(such as dominant hand positions). The noise will

Mutli-Dimensional Dynamic Time Warping

¨ Possibilité d’utiliser les signaux et leur dérivée55

TP56

Documents

RECONNAISSANCE DE GESTEScasiez/NIHM/ReconnaissanceDeGestes.pdf · ¤ Traitement des points pour éliminer les points trop proches: tout point dont la distance est inférieure à 3