Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
RECONNAISSANCE DE GESTESGéry Casiez http://cristal.univ-lille.fr/~casiez/ M2 IVI - NIHM
1
Minority report2
Iron man3
Plan
¨ Objectifs ¨ Technique libstroke ¨ Technique $1 ¨ Rubine ¨ DTW
4
Objectifs5
¨ Reconnaître des gestes réalisés par l’utilisateur pour les associer à des commandes
¨ On se limite aux gestes un seul trait (unistroke)
Unistroke - Graffiti (Palm OS)
Multistroke
Technique de reconnaissance de gestes élémentaires: libstroke (1997)
6
¨ Librairie disponible dans FVWM http://etla.net/libstroke/ ¨ Les gestes sont décrits par des séquences de
chiffres
Libstroke: algorithme7
¨ 1) détermination de la boite englobante (bounding box) à partir de minx, maxx,miny, maxy
¨ 2) division de la boite en 9 cellules ¨ 3) étiquetage des points ¨ 4) factorisation ¨ 5) comparaison aux motifs enregistrés
¨ Extension Firefox: Mouse Gestures Redox
Techniques de reconnaissance8
¨ Classifieurs statistiques (Rubine, DTW) ¨ Modèles de Markov cachés ¨ Réseaux de neurones ¨ Méthodes ad-hoc (libstroke, $1)
Techniques de reconnaissance9
¨ $1 recognizer ¨ Méthode simple à implémenter et donnant des taux
de reconnaissance comparables à Rubine et DTW ¨ 1 seul exemple suffit ¨ Technique publiée dans l’article de recherche
"Gestures without Librairies, Toolkits or Training: A $1 Recognizer for User Interface Prototypes" à la conférence User Interface Sofware and Technology (UIST) en 2007 (écrit par Wobbrock, Wilson et Li)
$1 recognizer10
Algorithme11
¨ 1) L’utilisateur réalise un geste ¤ Le geste est représenté par une liste ordonnée de points
¨ 2) Ce geste est comparé à un ensemble de gestes de référence (templates) en utilisant une mesure de distance euclidienne
¨ 3) Le geste reconnu est celui pour lequel cette distance est minimale
Problèmes posés12
¨ Le nombre de points pour définir un geste va dépendre de la vitesse d’exécution, de la fréquence d’échantillonnage du périphérique…
Problèmes posés13
¨ Un geste peut être réalisé à différentes positions sur l’interface,
¨ Suivant différentes orientations
¨ Et différentes échelles
4 étapes14
¨ 1) Ré-échantillonnage du geste pour être invariant à la fréquence d’acquisition et la vitesse d’exécution du geste
¨ 2) Ré-orientation du geste pour être invariant à l’orientation suivant laquelle il est exécuté
¨ 3) Mise à l’échelle et translation pour être invariant à l’échelle de réalisation du geste et la position à laquelle il est exécuté
¨ 4) Reconnaissance du geste
1ère étape: Ré-échantillonnage15
¨ Le geste est défini par M points ordonnés ¨ On veut N points ordonnés équidistants les uns des
autres ¨ N = 64
2e étape: Rotation « indicative » 16
¨ 1) Calcul du centre du geste (centroïde) ¨ 2) Calcul de l’angle entre le centroïde, le premier
point et l’horizontale ¨ 3) Rotation des points en utilisant cet angle
3e étape: mise à l’échelle et translation17
¨ Mise à l’échelle non uniforme: on ramène le geste à un carré de référence
¨ 1) détermination de la boîte englobante du geste (bounding box) ¤ Détermination de minx, maxx, miny, maxy
¨ 2) Mise à l’échelle ¨ 3) Translation à l’origine
4e étape: reconnaissance18
¨ Un geste candidat C est comparé à chacun des gestes enregistrés (templates) Ti pour calculer la distance moyenne di entre les points correspondants
¨ Le template avec la distance moyenne la plus faible est le résultat de la reconnaissance
¨ La distance est tranformée en score entre 0 et 1
Limitations19
¨ Pas possible de distinguer un carré d’un rectangle, une ellipse d’un cercle, une flèche vers le bas d’une flèche vers le haut
¨ Pas possible de reconnaître des gestes « 1D »
Classifieur statistique
¨ Rubine ¨ Reconnaissance de gestes ne comportant qu’un seul
tracé (unistroke) ¨ Traitement statistique de caractéristiques
20
Rubine
¨ Liste de points en entrée ¤ Traitement des points pour éliminer les points trop
proches: tout point dont la distance est inférieure à 3 pixels du point précédent est supprimé
¤ Calcul d’un vecteur de caractéristiques statistiques ¤ Comparaison aux différents gestes de référence. Le
geste avec le score le plus grand est renvoyé. ¤ Détection des gestes ambigus
21
Caractéristiques22
Initialisation
¨ Calcul des statistiques moyennes pour chaque caratéristique de chaque geste
¨ Calcul de la matrice de covariance moyenne de tous les gestes
¨ Objectif: trouver les coefficients pondérateurs des caractéristiques statistiques qui permettent de maximiser le score des gestes de chaque classe
23
Reconnaissance
¨ Calcul des caractéristiques du geste candidat ¨ Calcul d’une probabilité de correspondance pour
chaque geste de référence ¨ Le geste avec la probabilité la plus importante est
choisi
24
Dynamic Time Warping
¨ DTW ¨ Déformation temporelle dynamique ¨ Déterminer pour chaque élément d’une séquence, le
meilleur élément correspondant dans l’autre séquence relativement à un certain voisinage et à une certaine métrique
¨ Complexité polynomiale
25
Applications
¨ Vidéo, audio, graphique... ¨ Toutes données qui peuvent être transformées en
représentation linéaire en fonction du temps (séries temporelles)
¨ Echantillons ordonnés par une étiquette de temps ¨ Reconnaissance vocale ¨ Reconnaissance de gestes off-line et on-line ¨ alignement de protéines...
26
Principe de base
¨ Séquence de référence R = [r1, r2, ..., rn] ¨ Séquence de test T = [t1, t2, ..., tm] ¨ Si m=n alors on peut calculer la distance entre les deux
signaux de la façon suivante (pas forcément idéal):
¨ Possibilité de calculer la distance euclidienne ou d’utiliser une autre métrique (fonction qui donne un réel)
¨ Plus possible d’utiliser cette méthode dès que n ≠ m ¨ Attention: les échantillons des séquences doivent être
équidistants en temps
27
D =n�
i=1
distance(ri, ti) (1)
1
Principe de base
¨ DTW réalise d’abord un alignement non linéaire en recherchant parmi tous les alignements possibles, celui qui minimise une fonction de coût cumulé
¨ «Time Warping»: Dilation ou compression des séquences pour obtenir le meilleur alignement possible
28
Principe de base
¨ Détermination du chemin W = [w1, w2, ..., wk] de longueur minimale
29 D =n�
i=1
distance(ri, ti) (1)
k�
i=1
distance(wi) (2)
1
Principe de base
¨ Conditions aux frontières ¤ w1 = (r1, t1) ¤ wk = (rn, tm)
¨ Contraintes locales ¤ Monotonicité pour respecter le séquencement des points ¤ Eviter les sauts dans le temps ¤ Pour tout couple (ri, tj), le choix des prédécesseurs est limité à
n (ri-1,tj), (ri,tj-1), (ri-1,tj-1)
¨ Exhaustivité ¤ Chaque élément de R doit être mis en relation avec au moins
un élément de T et vice-versa ¤ max(m, n) ≤ k ≤ m + n - 1
30
Principe de base31
Principe de base
¨ Programmation dynamique ¨ Fonction d’optimisation:
¤ Soit D(i,j) la longueur du chemin entre (r1, t1) et (ri, tj) ¨ Récursion:
¤ D(i,j) = dist(ri,tj) + min(D(i-1,j), D(i-1,j-1), D(i, j-1)) ¤ condition initiale: D(1,1) = dist(r0,t0)
¨ Distance minimale entre les deux séquences ¤ D(n,m)
32
Mise en application
¨ Construction d’une matrice D de dimensions n x m dans laquelle chaque élément (i,j) contient D(i,j)
¨ Remplissage de D(1,1) avec la condition initiale ¨ Utilisation de la formule récursive pour remplir la
matrice ligne par ligne ou colonne par colonne ¨ Cas particuliers première ligne et première colonne ¨ Distance minimale donnée par l’élément D(n,m) ¨ La distance minimale peut être normalisée par la
longueur du chemin
33
Algorithme34
i
i
00 00
0 0 0
Chemin de déformation35
Chemin de déformation
¨ A chaque calcul de D(i,j), sauvegarde du prédécesseur qui minimise la distance
¨ Parcours des prédécesseurs en partant de D(n,m)
36
Complexité
¨ Complexité: O(m*n) ¨ Optimisation en limitant la région de recherche
37
Optimisation
¨ La zone supérieure gauche et la zone inférieure droite ne sont pas calculées
¨ Les distances locales associées sont mises à une valeur très élevée afin que le chemin n’y passe pas
38
Optimisation39
Optimisation
¨ Mise à l’échelle ¤ Réduire les tailles de R et T
40
Singularités
¨ Une singularité apparaît quand un point d’une séquence est associé à de nombreux points de l’autre séquence sans raison valable
41
2
Although DTW has been successfully used in many domains, it can produce
pathological results. The crucial observation is that the algorithm may try to explain
variability in the Y-axis by warping the X-axis. This can lead to unintuitive alignments
where a single point on one time series maps onto a large subsection of another time
series. We call examples of this undesirable behavior "singularities". A variety of ad-hoc
measures have been proposed to deal with singularities. All of these approaches
essentially constrain the possible warpings allowed. However they suffer from the
drawback that they may prevent the "correct" warping from being found.
In simulated cases, the correct warping can be known by warping a time series and
attempting to recover the original (see Section 4). In naturally occurring cases we take
"correct" to mean intuitively obvious "feature to feature" alignment as in Figure 2.B.
An additional problem with DTW is that the algorithm may fail to find obvious,
natural alignments in two sequences simply because a feature (i.e peak, valley, inflection
point, plateau etc.) in one sequence is slightly higher or lower than its corresponding
feature in the other sequence. Figure 2 illustrates this problem.
Figure 2 : A) Two synthetic signals (with the same mean and variance). B) The natural "feature to
feature" alignment. C) The alignment produced by dynamic time warping. Note that DTW failed to align
the two central peaks because they are slightly separated in the Y-axis
In this paper we address both these problems by introducing a modification of DTW.
The crucial difference is in the features we consider when attempting to find the correct
warping. Rather than use the raw data, we consider only the (estimated) local derivatives
of the data.
The rest of the paper is organized as follows. Section 2 contains a review of the classic
DTW algorithm, including the various techniques suggested to prevent singularities. In
Section 3 we introduce and demonstrate our extension, which we call Derivative
Dynamic Time Warping (DDTW). Section 4 contains experimental results, and in
Section 5 we offer conclusions and discuss possible directions for future work.
0 5 10 15 20 25 30 0 5 10 15 20 25 300 5 10 15 20 25 30
A B C
Singularités: technique de fenêtrage
¨ Ajout de contraintes dans les alignements possibles ¨ Technique de fenêtrage
¤ Donne une borne supérieure à la singularité
42
for j=max(1,i-w); j <= min (m,i+w); j++ doi
Singularités: slope weighting
¨ Pondérer les déplacements suivant les directions: ¤ D(i,j) = dist(ri,tj) + min(X*D(i-1,j), D(i-1,j-1), X*D(i, j-1)) ¤ X>1
¨ Force l’alignement suivant la diagonale
43
Singularités: Step patterns
¨ D(i,j) = dist(ri,tj) + min(D(i-1,j-2), D(i-1,j-1), D(i-2, j-1)) ¨ Force le déplacement suivant la diagonale
44
5
We could replace equation 5 with !(i,j) = d(i,j) + min{ !(i-1,j-1) , !(i-1,j-2) , !(i-
2,j-1) }, which corresponds with the step-pattern show in Figure 4.B. Using this
equation the warping path is forced to move one diagonal step for each step
parallel to an axis. Dozens of different step-patterns have been considered,
Rabiner and Juang (1993) contains a review.
All the above may help to mitigate the problem of singularities, but at the risk of
missing the correct warping. Additionally, it is not obvious how to chose the various
parameters (R for Windowing and X for Slope Weighting) or Step-Pattern.
3 Derivative dynamic time warping
If DTW attempts to align two sequences that are similar except for local accelerations
and decelerations in the time axis, the algorithm is likely to be successful. The algorithm
has problems when the two sequences also differ in the Y-axis. Global differences,
affecting the entire sequences, such as different means (offset translation), different
scalings (amplitude scaling) or linear trends can be efficiently removed (Keogh and
Pazzani 1998, Agrawal et. al. 1995). However the two series may also have local
differences in the Y-axis, for example a valley in one sequence may be deeper that the
corresponding valley in the other time series. Consider Figure 5 as an example. Two
identical sequences will clearly produce a one to one alignment. But if we slightly
change a local feature, in this case the depth of a valley, DTW attempts to explain the
difference in terms of the time-axis and produces two singularities.
Figure 5: Using DTW, two identical sequences (a) will clearly produce a one to one alignment (b).
However, if we slightly change a local feature, in this case the depth of a valley (c), DTW attempts to
explain the difference in terms of the time-axis and produces two singularities (d).
B A
Figure 4: A pictorial representation of two alternative step-patterns:
A) The pattern corresponding to !(i,j) = d(i,j) + min{ !(i-1,j-1) , !(i-1,j ) , !(i,j-1) }
B) The pattern corresponding to !(i,j) = d(i,j) + min{ !(i-1,j-1) , !(i-1,j-2) , !(i-2,j-1) }
a b dc
Singularités
¨ Problème: ces contraintes peuvent empêcher d’obtenir l’alignement correct
¨ Comment définir les différents paramètres des techniques de levée des singularités (taille de fenêtre, coefficients de pondération, motifs)?
45
DDTW: Derivative Dynamic Time Warping46
5
We could replace equation 5 with !(i,j) = d(i,j) + min{ !(i-1,j-1) , !(i-1,j-2) , !(i-
2,j-1) }, which corresponds with the step-pattern show in Figure 4.B. Using this
equation the warping path is forced to move one diagonal step for each step
parallel to an axis. Dozens of different step-patterns have been considered,
Rabiner and Juang (1993) contains a review.
All the above may help to mitigate the problem of singularities, but at the risk of
missing the correct warping. Additionally, it is not obvious how to chose the various
parameters (R for Windowing and X for Slope Weighting) or Step-Pattern.
3 Derivative dynamic time warping
If DTW attempts to align two sequences that are similar except for local accelerations
and decelerations in the time axis, the algorithm is likely to be successful. The algorithm
has problems when the two sequences also differ in the Y-axis. Global differences,
affecting the entire sequences, such as different means (offset translation), different
scalings (amplitude scaling) or linear trends can be efficiently removed (Keogh and
Pazzani 1998, Agrawal et. al. 1995). However the two series may also have local
differences in the Y-axis, for example a valley in one sequence may be deeper that the
corresponding valley in the other time series. Consider Figure 5 as an example. Two
identical sequences will clearly produce a one to one alignment. But if we slightly
change a local feature, in this case the depth of a valley, DTW attempts to explain the
difference in terms of the time-axis and produces two singularities.
Figure 5: Using DTW, two identical sequences (a) will clearly produce a one to one alignment (b).
However, if we slightly change a local feature, in this case the depth of a valley (c), DTW attempts to
explain the difference in terms of the time-axis and produces two singularities (d).
B A
Figure 4: A pictorial representation of two alternative step-patterns:
A) The pattern corresponding to !(i,j) = d(i,j) + min{ !(i-1,j-1) , !(i-1,j ) , !(i,j-1) }
B) The pattern corresponding to !(i,j) = d(i,j) + min{ !(i-1,j-1) , !(i-1,j-2) , !(i-2,j-1) }
a b dc
@INPROCEEDINGS{Keogh01derivativedynamic, author = {Eamonn J. Keogh and Michael J. Pazzani}, title = {Derivative Dynamic Time Warping}, booktitle = {In First SIAM International Conference on Data Mining (SDM’2001}, year = {2001}}
DDTW: Derivative Dynamic Time Warping
¨ Utilisation de la dérivée des données des séquences
¨ Calcul du carré de la différence des dérivées
47
6
The weakness of DTW is in the features it considers. It only considers a datapoints Y-
axis value. For example consider two datapoints qi and c
j which have identical values, but
qi
is part of a rising trend and cj is part of a falling trend. DTW considers a mapping
between these two points ideal, although intuitively we would prefer not to map a rising
trend to a falling trend. To prevent this problem we propose a modification of DTW that
does not consider the Y-values of the datapoints, but rather considers the higher level
feature of "shape". We obtain information about shape by considering the first derivative
of the sequences, and thus call our algorithm Derivative Dynamic Time Warping
(DDTW).
3.1 Algorithm details
As before we construct an n-by-m matrix where the (ith, j
th) element of the matrix
contains the distance d(qi,c
j) between the two points q
i and c
j. With DDTW the distance
measure d(qi,c
j) is not Euclidean but rather the square of the difference of the estimated
derivatives of qi and c
j. While there exist sophisticated methods for estimating
derivatives, particularly if one knows something about the underlying model generating
the data, we use the following estimate for simplicity and generality.
1 < i < m (6)
This estimate is simply the average of the slope of the line through the point in
question and its left neighbor, and the slope of the line through the left neighbor and the
right neighbor. Empirically this estimate is more robust to outliers than any estimate
considering only two datapoints. Note the estimate is not defined for the first and last
elements of the sequence. Instead we use the estimates of the second and penultimate
elements respectively. For noisy datasets we use exponential smoothing (Mills 1990)
before attempting to estimate the derivatives.
DDTW’s time complexity is O(mn), which is the same as DTW. There are some added
constant factors because of derivative estimating step, but there is no need to remove
offset translation, a necessary step for DTW. Empirically the two algorithms take
approximately the same time. A variety of optimizations have been proposed for DTW
(Myers et. al 1990). We omit discussion of them here for brevity, but note that they apply
equally well to DDTW.
Figure 6: 1) Two artificial signals. 2) The intuitive feature to feature warping alignment. 3) The
alignment produced by classic DTW. 4) The alignment produced by DDTW.
2
)2)(()(][ 111 !+! !+!= iiii
x
qqqqqD
0 5 10 15 20 25 30
4
1 2
3
6
The weakness of DTW is in the features it considers. It only considers a datapoints Y-
axis value. For example consider two datapoints qi and c
j which have identical values, but
qi
is part of a rising trend and cj is part of a falling trend. DTW considers a mapping
between these two points ideal, although intuitively we would prefer not to map a rising
trend to a falling trend. To prevent this problem we propose a modification of DTW that
does not consider the Y-values of the datapoints, but rather considers the higher level
feature of "shape". We obtain information about shape by considering the first derivative
of the sequences, and thus call our algorithm Derivative Dynamic Time Warping
(DDTW).
3.1 Algorithm details
As before we construct an n-by-m matrix where the (ith, j
th) element of the matrix
contains the distance d(qi,c
j) between the two points q
i and c
j. With DDTW the distance
measure d(qi,c
j) is not Euclidean but rather the square of the difference of the estimated
derivatives of qi and c
j. While there exist sophisticated methods for estimating
derivatives, particularly if one knows something about the underlying model generating
the data, we use the following estimate for simplicity and generality.
1 < i < m (6)
This estimate is simply the average of the slope of the line through the point in
question and its left neighbor, and the slope of the line through the left neighbor and the
right neighbor. Empirically this estimate is more robust to outliers than any estimate
considering only two datapoints. Note the estimate is not defined for the first and last
elements of the sequence. Instead we use the estimates of the second and penultimate
elements respectively. For noisy datasets we use exponential smoothing (Mills 1990)
before attempting to estimate the derivatives.
DDTW’s time complexity is O(mn), which is the same as DTW. There are some added
constant factors because of derivative estimating step, but there is no need to remove
offset translation, a necessary step for DTW. Empirically the two algorithms take
approximately the same time. A variety of optimizations have been proposed for DTW
(Myers et. al 1990). We omit discussion of them here for brevity, but note that they apply
equally well to DDTW.
Figure 6: 1) Two artificial signals. 2) The intuitive feature to feature warping alignment. 3) The
alignment produced by classic DTW. 4) The alignment produced by DDTW.
2
)2)(()(][ 111 !+! !+!= iiii
x
qqqqqD
0 5 10 15 20 25 30
4
1 2
3
DDTW: Derivative Dynamic Time Warping48
8
These results confirm what a casual perusal of Figure 7 tells us. DTW attempts to
correct minor differences between the sequences by wild warpings of the time axis. In
contrast DDTW, which considers higher levels features is much less inclined to "find" a
warping were none exists.
Figure 7: Examples of the three experimental datasets used in this paper. Each box in the leftmost
column contains two related sequences that do not contain time warping, but do have minor differences
in the Y-axis. The warpings "discovered" by the two algorithms discussed in this paper are therefore
completely spurious. The two rightmost columns show examples of the warpings returned by both
algorithms. Table 1 contains a numerical comparison.
4.2 Finding the correct warping
To test the ability of an algorithm to discover the correct warping between two
sequences we need sequences for which the correct warping is known. Our approach is to
take a sequence Q and to make a copy of it. This copy, Q’ then has a warping randomly
inserted into it. We can then use Q and Q’ as input into the two algorithms and compare
warpings.
To distort sequence Q’ we begin by randomly choosing an anchor point, somewhere on
the sequence. We first distort the Y-axis by adding or subtracting a Gaussian bump,
centered on the anchor point, to the sequence. We did this for 3 different height bumps
0.0, 0.1 and 0.2 (0.0 corresponding to no bump). Next, we randomly shifted the anchor
point 15 time-units left or right. The other datapoints were moved to compensate for this
shift by an amount that depended on their inverse squared distance to the anchor point,
thus localizing the effect. After the X-axis transformation we interpolated the data back
onto the original, equi-spaced X-axis. The net effect of the two transformations is a
smooth local distortion of the original sequence, as shown in Figure 8.
Space Shuttle
Exchange rate
EEG
DTW DDTW
Recherche de motifs49
Reconnaissance de gestes
¨ Pré-traitement des gestes de référence et du geste réalisé par l’utilisateur ¤ mise à l’échelle, changement origine, rotation…
¨ Alignement avec chacun des gestes ¨ Le geste de référence pour lequel la distance entre
les deux séquences est la plus faible est considéré comme reconnu
50
Affine invariant DTW
¨ Problème: identifier dans une séquence un objet qui subit une transformation affine
¨ Choix de la fonction de coût?
51
!!"#"$"$$ !"#$%!&'"% !! ""#$ ()
!!"#"$"## !&'"%!"#$% !! "%#$ ()))))))))))%*+!)
,-./.) &&##
##%
&
&'""
%
&
&'""#
%#$
%$
*
!%(
*
!%(
*(
*()
!0!%%!!%%1!%((
*
!%(( "&'"!&!
%
&
"&'"!&!####$$$$( """"""# &
#
()
!0!%%!!%%1!%((
*
!%(( "&'"!&!
%
&
"&'"!&!##$$$$##) ""%"""# &
#
()
0!%!%1 +
(
+
(
*
!&!!&!
%
&
##$$* "%"# &#
2))))))))))))))))))))))))))))))))
)
4. Experiments on Online Rotated
Handwriting Recognition
)3#45/.)+)6789:;.")'<)='>8>.?)@.">#$4)A89:;.")
)
@') .789#$.) >-.) 5>#;#>B) '<) CDEF@G() ,.) -8H.)
.7.&5>.?) /.&'4$#>#'$) .7:./#9.$>") '$) 8) ".>) '<) /'>8>.?)
'$;#$.)-8$?,/#>#$4)?#4#>)?8>8)</'9)>-.)IJD)=.:'"#>'/B)
'<) 98&-#$.) ;.8/$#$4) ?8>8K8".") L*MN2) @-.) ?8>8K8".)
#$&;5?.") O(PQP) >/8#$#$4) "89:;.") 8$?) R(PQS) ,/#>./)
#$?.:.$?.$>) >.">#$4) "89:;."2) D<) 8) "89:;.) &'$"#">") '<)
9'/.)>-8$)'$.)">/'T.(),.)&'$$.&>)>-.9)#$>')'$.)">/'T.)
<'/)"#9:;#&#>B2)C;;)>-.)"89:;.")8/.)$'/98;#U.?)>')8)"#U.)
'<) *+MVQM2) G.) "89:;.) +M) :'#$>") ,#>-) .W58;) ":8&.)
#$>./H8;)8;'$4)>-.):.$)>/8&.)'<).8&-)"89:;.2)@-.$).8&-)
'<) >-.) >.">#$4) "89:;.") ,8") /'>8>.?) KB) 8$) 8$4;.)
/8$?'9;B) 4.$./8>.?) K.>,..$) LE!X+()!X+N2) A'9.) '<) >-.)
/'>8>.?)>.">#$4)"89:;.")8/.)"-',$)#$)3#42)+2))
F@G)-8?)K..$):/'H.?)>')K.)8$).<<#&#.$>)9.>-'?)>')
&8;&5;8>.)?#">8$&.)#$)'$;#$.)-8$?,/#>#$4)/.&'4$#>#'$)LRE
YN2)Z./.),.)".;.&>)>-.)"#9:;.)+E[.8/.">)[.#4-K'/)%[[!)
&;8""#<#./) >') &'9:8/.) >-.) /.&'4$#>#'$) :./<'/98$&.) '<)
>,') ?#">8$&.) 9.>/#&"\) F@G) 8$?) CDEF@G2) C") >-.)
'$;#$.) "89:;.") -8H.) K..$) $'/98;#U.?() >') K.) <8#/) <'/)
F@G) 8$?) CDEF@G() ,.) 5".) >-.) /'>8>#'$) 8$?) "-#<>)
#$H8/#8$>) ,#>-) <#7.?) "&8;.) /8>.) "]*) %"..) A.&>#'$) R2R!2)
@-.)/.&'4$#>#'$)/."5;>")8/.)"5998/#U.?)#$)@8K;.)*2)G.)
&8$)<#$?)>-8>)>-.)CDEF@G):./<'/9.?)95&-)K.>>./)>-8$)
>-.) F@G) 9.>-'?) ?#?() ,-#&-) #$?#&8>.?) >-8>) >-.) CDE
F@G) #") 8) 9'/.) .<<.&>#H.) 9.>-'?) >') &'9:8/.) '$;#$.)
/'>8>.?) "-8:.") >-8$) &;8""#&8;) F@G2) @-.) 98#$)
?/8,K8&T) '<) 5"#$4) [[) &;8""#<#./) #") #>") ;',) ":..?2)
Z',.H./()'$.)&8$)/.?5&.)>-.)&'9:5>8>#'$)>#9.)4/.8>;B)
KB)5"#$4);.8/$.?):/'>'>B:.")LPN2)F5.)>'):84.);#9#>8>#'$()
,.)-8H.)>')'9#>)>-.)?.>8#;.?)?#"&5""#'$)'$)>-#"2)
)
@8K;.)*)=.&'4$#>#'$)/8>.")'<)F@G)8$?)CDEF@G)
%+)#")>-.)$59K./)'<)$.#4-K'/")5".?!)
+, *) R) Y) *M)
F@G) ^P2MM_ ^P2PP_) ^Y2PP_) ^Y2SO_
CDEF@G QY2YP_ QY2**_) QP2^Y_) QP2MS_
)
5.Conclusion )
@-#") :8:./) :/':'".") >-.) 8<<#$.) #$H8/#8$>) ?B$89#&)
>#9.),8/:#$4)9.>-'?)>')&'9:8/.)>,')".W5.$&."),-#&-)
98B)-8H.);'&8;)"-8:.)?.<'/98>#'$)8$?)5$?./4')4;'K8;)
5$T$',$) 8<<#$.) >/8$"<'/92) G.) <'/95;8>.?) >-.)
>/8$"<'/98>#'$)98>/#7)8$?),8/:#$4):8>-) #$>')8)5$#<#.?)
':>#98;):/'K;.9)8$?)?.H.;':.?)8$) #>./8>#H.)8;4'/#>-9)
>') <#$?) >-.) ':>#98;) "';5>#'$2) 67:./#9.$>") '$) '$;#$.)
/'>8>.?) -8$?,/#>#$4) #$?#&8>.?) >-8>) '5/) 9.>-'?) &8$)
"#4$#<#&8$>;B)#9:/'H.)>-.)/.&'4$#>#'$)/8>.")'<)>-.)F@G2)
3'/) <5>5/.) ,'/T() ,.) 8/.) &'$"#?./#$4) *!) >') ">5?B) >-.)
/.45;8>#'$")'<)>/8$"<'/98>#'$)98>/#7)8$?)+!)>')&'9K#$.)
>-.)">8>#">#&8;);.8/$#$4)9.>-'?"),#>-)>-.)CDEF@G2)
)
References )L*N) `2) =8K#$./) 5$?) a2Z2) b58$42) c35$?89.$>8;") '<) A:..&-)
=.&'4$#>#'$c2)-"*%!&(*,./00,-122)*QQR)
L+N) Z2) A8T'.) 8$?) A2) J-#K8() cC) FB$89#&) d/'4/899#$4) C;4'/#>-9)
e:>#9#U8>#'$)<'/)A:'T.$)G'/?)=.&'4$#>#'$(c)3444,1"/%56,7%,899-()
+^%*!\)PREPQ2)*QOS)
LRN) Z2) f#>'98() A2#) I&-#?8() 8$?) Z2) A8T'.() ce$;#$.) &-8/8&>./)
/.&'4$#>#'$)5"#$4).#4.$E?.<'/98>#'$"(c)):!;,3<.=2)@'TB'()+MMP)
LPN) b2) C;'$() g2) C>-#>"'"() 8$?) A2) A&;8/'<<2) ce$;#$.) 8$?) e<<;#$.)
J-8/8&>./) =.&'4$#>#'$) I"#$4) C;#4$9.$>) >') d/'>'>B:."(c) -"7(6,
3>?82()h'/.8)+MMY
LYN)J2)a8-;98$$)8$?)Z2)a5/T-8/?>()c@-.),/#>./) #$?.:.$?.$>)'$;#$.)
-8$?,/#>#$4)/.&'4$#>#'$)"B">.9)<;'4)'$)-8$?)8$?)&;5">./)4.$./8>#H.)
">8>#">#&8;)?B$89#&)>#9.),8/:#$4(c)3444,1"/%56,7%,-8@3()+^%R!\+QQE
R*M()+MMP)
L^N)@2)=8>-)8$?)=2)f8$98>-82)cG'/?)#984.)98>&-#$4)5"#$4)?B$89#&)
>#9.),8/:#$4c()-"7(6,>A-2()H';2)+2)::2)Y+*EY+O()+MMR)
LON) F#'$#"#'() J2=2d2i) h#9() Z2j2c[.,) <.8>5/.") <'/) 8<<#$.E#$H8/#8$>)
"-8:.)&;8""#<#&8>#'$(c)-"7(6,3>3-()::\+*RY)E)+*RS(,+MMP)
LSN)G#;;#89)Z)d2()A85;)C)@2()G#;;#89)@)g2()a/#8$)d)32()c[59./#&8;)
=.&#:.") #$) Jkk\) @-.) C/>) '<) A&#.$>#<#&2) J'9:5>#$4(c) >/BC"&)D*,
E%&F*"5&!#,-"*55()J89K/#?4.()6$4;8$?()*QQ+)
LQN) F.9:">./() C2() `8#/?() [2() 8$?) =5K#$() F2) cf87#959) ;#T.;#-''?)
</'9)#$&'9E:;.>.)?8>8)H#8)>-.)6f)8;4'/#>-92(c)G7H"%/0,7I,!;*,27#/0,
9!/!&5!&(/0,97(&*!#J,9*"&*5,K(RQ%*!\*lRS2)*QOO)
L*MN) Z.>>#&-) A(2) ) a;8T.) J2`() 8$?) f./U) J2b2) cIJD) =.:'"#>'/B) '<)
98&-#$.) ;.8/$#$4) ?8>8K8"."c() *QQS)
L->>:\XX,,,2#&"25.?5Xm9;.8/$Xf`=.:'"#>'/B2->9;N)
Affine invariant DTW52
!"#$%&'#()*!+ ,-.+ *"/"-((0+ 1*%&'2)+ '.)+ -.'/3)&+
4"&1.5+ /3)+ %&',)!!6+ 7.4+ 18+ /3)+ !9"-&)+ '8+ :",(14)-.+
41!/-.,)+ 1!+ "!)4+ /'+ 1.+ /3)+ !),'.4+ !"#$%&'#()*+ /'+
,-(,"(-/)+!;"#<+ $%=<+ /3)+ ,'.2)&5).,)+'8+ /31!+*)/3'4+ ,-.+
#)+ -.-(0>)4+ #0+ -+ !1*1(-&+ ?-0+ '8+ :@%),/-/1'.$
A-@1*1>-/1'.+ ;:A=+ -(5'&1/3*+ BCD6+ E3)+ 4)/-1(!+ '8+ '"&+
-(5'&1/3*+-&)+-!+8'(('?!F+
Optimization Algorithm of AI-DTW:
G.1/1-(1>)++
E3)+?-&%1.5+%-/3+&;H=I'()*+,(-;(<+.=6+
G/)&-/1'.+."*#)&+/IH+
J31()+.'/+,'.2)&5).,)+
++++++++++++/+I/KHL+
++++++++++++M%4-/)+/3)+/&-.!8'&*-/1'.+*-/&1@+#0F+
++++++++++++++++++++++ ; H=
N; =
; =H
-&5*1.O P/
0/
# & #, #
, " $ ,!
"
" !# +6++
M%4-/)+/3)+?-&%1.5+%-/3+#0F+
&;/=I'()*+,(-;(<+.,;/==6+
:.4+J31()++
Q'*)+ )@-*%()!+ '8+ /3)+ -#'2)+ -(5'&1/3*+ -&)+
1(("!/&-/)4+ 1.+R15"&)+H<+-.4+'.)+,-.+ 81.4+ /3-/+ /3)+*'!/+
)&&'&+ &)4",/1'.+ 1!+ 4'.)+ 1.+ /3)+ 81&!/+ 1/)&-/1'.!+ -.4+ /3)+
7G$SEJ+,-.+"!"-((0+,'.2)&5)+1.+-+8)?+1/)&-/1'.!6+++
3.3 Rotation, Scale and Shift Invariant DTW +
+++G.+*-.0+N$41*).!1'.+!3-%)+-.-(0!1!+-%%(1,-/1'.!<+
!",3+-!+'.(1.)+3-.4?&1/1.5<+1/+1!+'8/).+.),)!!-&0+/'+"!)+
&'/-/1'.+ !,-()+ -.4+ !318/+ 1.2-&1-./+ ;TQQ+ 1.2-&1-./+ 8'&+
!3'&/=+*)/3'46+T'/-/1'.+!,-()+-.4+!318/+,-.+#)+!)).+-!+
!%),1-(+,-!)!+'8+-881.)+/&-.!8'&*6+U)&)<+?)+41!,"!!+/31!+
%&'#()*+!)%-&-/)(0+8'&+/3)+8'(('?1.5+&)-!'.!F+
;H= E3)+ TQQ+ 1.2-&1-./+ %&'#()*+ ,-.+ #)+ !'(2)4+ *'&)+
)881,1)./(0+?1/3'"/+/3)+*-/&1@+,-(,"(-/1'.+1.+:9+;C=6+
;N= G.+ !'*)+ -%%(1,-/1'.!<+ /3)+ 7G$SEJ+ *-0+ '2)&$81/+
/3)+ 1.%"/+ !)9").,)6+ R'&+ )@-*%()+ 1.+ /3)+ '.(1.)+
3-.4?&1/1.5<+ /3)+ !-*%()!+ '8+ 4151/+ VNV<VWV+ ,-.+ #)+
/&-.!8'&*)4+!3-%)!+2)&0+ !1*1(-&+ /'+4151/+ VHV<+ 18+?)+
?&1.X()+ /3)+ !-*%()!+ -('.5+1+ ,''&41.-/)6+E3"!+ 8'&+
/3)!)+)@-*%()!<+ 1/+ 1!+*'&)+.-/"&-(+ /'+ !)/+ /3)+ !,-()+
&-/)!+'8+1+,''&41.-/)+-.4+2+,''&41.-/)+-!+/3)+!-*)6+
Y12).+-+%'1./+ ;1<+2=<+ /3)+,''&41.-/)!+ -8/)&+ &'/-/1'.<+
!,-()+-.4+!318/+/&-.!8'&*-/1'.+,-.+#)+,-(,"(-/)4+#0F+
13I+$,'!;!=+1K+$!1.;!=2K!1+
23I$$!1.;!=+1K+$,'!;!=2K!2<44444444++++++++++;HZ=++?3)&)+$+4).'/)!+/3)+!,-()+&-/)<+!+1!+/3)+&'/-/1'.+-.5()<+
-.4+ !1<+ !2+ &)%&)!)./!+ /3)+ !318/+ %-&-*)/)&!6+ R'&+
!1*%(1,1/0<+?)+4).'/)+/3)+/&-.!8'&*-/1'.+1.+:96+;HZ=+#0+
;13<+ 23=I5;1<+ 2=<+ -.4+ -!!"*)+ /3-/+ )-,3+ )()*)./+ "#I+ ;1"6#<+
2"6#=+ 1!+ -+ N$41*).!1'.+ 2),/'&+ ,'*%'!)4+ #0+ /3)+
,''&41.-/)!+ + '8+ %'1./+ "#6+ Q1*1(-&(0+ $%I+ ;1$6#<+2$6#=6+Y12).+
?-&%1.5+ %-/3+ &<+ /3)+ 81&!/+ !"#+ '%/1*-(+ %&'#()*+
41!,"!!)4+1.+Q),/1'.+[6N+,-.+#)+&)$8'&*"(-/)4+-!F+
#"
!"0
#
#N$
$5".(9H
N
=;<<<
=;=<;*1.-&5$
6+++++++++++;HH=+
\0+ !'(21.5+ /3)+ 8".,/1'.!+ "9]"$IZ<+ + "9]"!IZ<4
"9]"8IZ<4"9]"7IZ<+?)+3-2)F+
H/-. ; =!
:$
!" <+! ;!:$ ]NN
%" <+
! "! #! $! %!
!
"!
#!
$!
%!
&!!
&"!
'()*+,-
./! ! /! &!!&!
"!
0!
#!
/!
$!
1!
%!
2!
&!!
&&!'()*+,3,45567"$62!"%
! "! #! $! %!
!
"!
#!
$!
%!
&!!
&"!
'+56,&,455671602&/
! "! #! $! %!
!
"!
#!
$!
%!
&!!
&"!
'+56,#,45567/60%#&
! / &!!
/
&!
&/
"!
"/
0!
'+458+9:(
;55:5
++
! "! #! $! %!
!
"!
#!
$!
%!
&!!
&"!
'()*+,-
"! #! $! %!!
"!
#!
$!
%!
&!!
&"!'()*+,3 ,455670$62!1/
! "! #! $! %!
!
"!
#!
$!
%!
&!!
&"!
'+56,&,4556726%!2%
! "! #! $! %!
!
"!
#!
$!
%!
&!!
&"!
'+56,#,45567$6%!00
! / &!!
/
&!
&/
"!
"/
0!
'+458+9:(
;55:5
+R15"&)+ H+ :@-*%()!+ '8+ 7G$SEJ6+ E'%+ &'?F+ 4151/+ N6+ \'//'*+ &'?F+ 4151/+ ^6+ _)8/+ ,'("*.F+ G.%"/+ %-//)&.+ EL+ N.4+
,'("*.F+G.%"/+%-//)&.+T+-8/)&+&'/-/1'.6+[&4+,'("*.F+%-//)&.+T+-8/)&+ /3)+81&!/+ 1/)&-/1'.+'8+7G$SEJ6+`/3+,'("*.F+
%-//)&.+T+-8/)&+/3)+`/3+1/)&-/1'.L+T153/+,'("*.F+:&&'&+,"&2)+-('.5+1/)&-/1'.+."*6+
Yu Qiao and Makoto Yasuhara. 2006. Affine Invariant Dynamic Time Warping and its Application to Online Rotated Handwriting Recognition. In Proceedings of the 18th International Conference on Pattern Recognition - Volume 02 (ICPR '06), Vol. 2. IEEE Computer Society, Washington, DC, USA, 905-908. DOI=10.1109/ICPR.2006.228 http://dx.doi.org/10.1109/ICPR.2006.228
Multi-Dimensional Dynamic Time Warping
¨ Comment appliquer DTW quand plusieurs grandeurs sont mesurées à chaque instant?
¨ Modification de la fonction de distance: ¤ Chaque dimension est normalisée avec une moyenne de
0 et une variance de 1 ¤ Calcul de la somme des valeurs absolues des
différences suivant chaque dimension
53
G. A. ten Holt, M. J. T. Reinders and E. A. Hendriks. Multi-dimensional dynamic time warping for gesture recognition. In Proceedings of the Thirteenth annual conference of the Advanced School for Computing and Imaging, 2007.
Multi-Dimensional Dynamic Time Warping54
Figure 3: The necessity for MD-DTW. (a) shows two artificial 2D time series of equal mean and variance. Di-mension 1 is shown in column 1, dimension 2 in column 2. The series contain synchronisation information inboth dimensions. If 1D-DTW is performed using the first dimension, the result is suboptimal for dimension 2, asillustrated in (b). Note that the peaks and valleys in dimension 2 are not aligned properly. 1D-DTW on dimension2 gives a similar suboptimal match for dimension 1. MD-DTW takes both dimensions into account and finds thebest synchronisation (c).
result. MD-DTW takes both dimensions into accountin finding the optimal synchronisation. The result isa synchronisation that is as ideal as possible for bothdimensions, as shown in figure 3 (c). This is the ad-vantage of MD-DTW over regular DTW.
Though the series in figure 3 is artificial, the sit-uation it depicts is not unrealistic. Figure 4 shows asimilar situation from a real-world multi-dimensionalseries: x- and y-co-ordinates of the right hand for theDutch sign for acrobat. Here, the y-co-ordinate is in-formative for the first part of the series, and the x-co-ordinate for the second part. To get both peaksproperly matched, MD-DTW is necessary.
2.2 DTW on Derivatives[4] argue that for synchronising shape-
characteristics of series (such as peaks and slopes),it is beneficial to perform DTW on the first-orderderivatives of the feature values (DDTW). Since wewant to perform such shape-matching in each of ourdimensions, we considered this option for the MD-DTW algorithm. In our case, this meant taking thefirst-order derivative in each dimension separately.This gives us information about the slopes and peaksin each dimension. The series were first smoothed
in each dimension with a Gaussian filter (� = 5) todiminish noise effects. Then an approximation of thederivative was taken in each dimension using the filterder(a(t)) = (a(t + 1)� a(t� 1))/2.
Now we can perform two types of MD-DTW: onthe feature values and on their derivatives. As a thirdtype, we took the derivatives and added them to theseries as extra dimensions, doubling the dimensional-ity of the series. In this setting, both the feature valuesthemselves and their derivatives are taken into consid-eration when searching for the ideal warp. We testedour algorithm in all three settings. They are denotedas S(ignal), D(erivative) and SD(signal+derivative).In each case, the series were smoothed and nor-malised before (MD)DTW was commenced.
2.3 Dimension SelectionTo compare series with various dimensions, it is
necessary to normalise the dimensions. However, thisis a problem if a dimension contains only noise. Forexample, position data gathered on the non-dominanthand in a one-handed gesture. Normalisation willenlarge the noise in this dimension to the same pro-portions as the informative data in other dimensions(such as dominant hand positions). The noise will
Mutli-Dimensional Dynamic Time Warping
¨ Possibilité d’utiliser les signaux et leur dérivée55
TP56