7
Simultaneous Tracking and Shape Estimation with Laser Scanners Markus Sch¨ utz * , Nils Appenrodt , J¨ urgen Dickmann , Klaus Dietmayer § * Daimler AG, Ulm, Germany Email: [email protected] Daimler AG, Ulm, Germany Email: [email protected] Daimler AG, Ulm, Germany Email: [email protected] § University of Ulm, Germany Email: [email protected] Abstract—Advanced driver assistance systems and the envi- ronment perception for autonomous vehicles will benefit from systems robustly tracking objects while simultaneously estimating their shape. Unlike many recent approaches that represent object shapes by approximated models such as boxes or ellipses, this paper proposes an algorithm that estimates a free-formed shape derived from raw laser measurements. For that purpose local occupancy grid maps are used to model arbitrary object shapes. Beside shape estimation the algorithm keeps a stable reference point on the object. This will be important to avoid apparent motion if the observable part of an object contour changes. The algorithm is part of a perception system and is tested with two 4-layer laser scanners. I. I NTRODUCTION Today’s dense traffic situation requires more and more the support of advanced driver assistance systems. An improved environment perception, especially the detection and tracking of the surrounding objects, becomes crucial for multiple safety applications. Tracking algorithms, that are designed to estimate the dynamic states of measured objects, are a well-explored field in literature. The common approach is to use the Gaussian approximation to model the uncertainty of the state and use a Gaussian filter for estimation, e.g. a standard or an extended Kalman filter. Those filters are generally used to infer dynamic states of the objects, such as the position and the velocity of objects, from noisy measurements. In those tracking systems a single measurement updates a track. Since a laser scan provides usually several reflexions per object, it is crucial to find a suitable common reference point for all these reflexions that represents the object’s dynamic state best. A common approach is to use the mean value of these reflexions as a measurement to update the track. Fig. 2 shows a major problem of those approaches: the moving reference point that leads to an apparent motion when the observable part of an object contour changes e.g. when the perspective changes from that an object is seen. More advanced approaches search for stable features of an object such as corners and use one of these as the reference point (that must be able to change suddenly when the object turns in a way that this corner is not visible anymore). These model-based approaches are mostly used for a specific kind of object, such as cars. For example box-models are often used to describe the shape of a car. Beside the use of the corners as stable features, this model also allows to estimate the size of the vehicle. Often, this box-model assumption is a suitable approximation. However, there exist scenarios in which a more accurate representation of the shape is important, especially when no specific types of objects can be expected, but accurate shapes have to be estimated. An important application for this requirement are pre- crash scenarios. Here it is crucial to detect if an accident is unavoidable. A bounding box model easily overrates the danger in such cases and produces false alarms. Furthermore, a tracking system that is not restricted to cars and its typical shape is eligible. The presented approach in this paper aims to tackle these two problems: on the one side keeping a stable reference point to achieve accurate tracking results and on the other side estimating a detailed shape that allows to overcome limitations caused by abstract approximations. To achieve these goals, we use occupancy grid maps to store associated measurements and represent the shape of the object. The occupancy grids are normally used to map the static environment of a vehicle or robot. In contrast to this application the occupancy grids in the presented approach are fixed to the objects, i.e. they are object-local. This approach entails the limitation that the shape of the objects has to be static while tracking. The major benefit is the complexity of shapes that can be accurately modeled. Furthermore, a fixed anchor point on the shape is available that prevents the errors caused by a moving reference point as shown in Fig. 2. Two Sick LD-MRS400001 laser scanners, mounted at the front of our test car (see Fig. 1), provide the input data for the developed algorithm. Cooperative traffic (e.g. Vehicle-2-vehicle communication) is not considered since the system is designed to work with generic shaped objects that does not necessarily provide this data (bicycles, pedestrians, common vehicles). Also a highly accurate inertial navigation platform fused with a D-GPS

Simultaneous Tracking and Shape Estimation with Laser Scannersfusion.isif.org/proceedings/fusion2013/html/pdf... · vehicle. The laser scanners has four layers with a vertical field

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Simultaneous Tracking and Shape Estimation with Laser Scannersfusion.isif.org/proceedings/fusion2013/html/pdf... · vehicle. The laser scanners has four layers with a vertical field

Simultaneous Tracking and Shape Estimation withLaser Scanners

Markus Schutz∗, Nils Appenrodt†, Jurgen Dickmann‡, Klaus Dietmayer§∗Daimler AG, Ulm, Germany

Email: [email protected]†Daimler AG, Ulm, Germany

Email: [email protected]‡Daimler AG, Ulm, Germany

Email: [email protected]§University of Ulm, Germany

Email: [email protected]

Abstract—Advanced driver assistance systems and the envi-ronment perception for autonomous vehicles will benefit fromsystems robustly tracking objects while simultaneously estimatingtheir shape. Unlike many recent approaches that represent objectshapes by approximated models such as boxes or ellipses, thispaper proposes an algorithm that estimates a free-formed shapederived from raw laser measurements. For that purpose localoccupancy grid maps are used to model arbitrary object shapes.Beside shape estimation the algorithm keeps a stable referencepoint on the object. This will be important to avoid apparentmotion if the observable part of an object contour changes. Thealgorithm is part of a perception system and is tested with two4-layer laser scanners.

I. INTRODUCTION

Today’s dense traffic situation requires more and more thesupport of advanced driver assistance systems. An improvedenvironment perception, especially the detection and trackingof the surrounding objects, becomes crucial for multiple safetyapplications.

Tracking algorithms, that are designed to estimate thedynamic states of measured objects, are a well-explored fieldin literature. The common approach is to use the Gaussianapproximation to model the uncertainty of the state and use aGaussian filter for estimation, e.g. a standard or an extendedKalman filter. Those filters are generally used to infer dynamicstates of the objects, such as the position and the velocity ofobjects, from noisy measurements. In those tracking systemsa single measurement updates a track. Since a laser scanprovides usually several reflexions per object, it is crucial tofind a suitable common reference point for all these reflexionsthat represents the object’s dynamic state best. A commonapproach is to use the mean value of these reflexions as ameasurement to update the track. Fig. 2 shows a major problemof those approaches: the moving reference point that leadsto an apparent motion when the observable part of an objectcontour changes e.g. when the perspective changes from thatan object is seen. More advanced approaches search for stablefeatures of an object such as corners and use one of theseas the reference point (that must be able to change suddenlywhen the object turns in a way that this corner is not visibleanymore). These model-based approaches are mostly used for

a specific kind of object, such as cars. For example box-modelsare often used to describe the shape of a car. Beside the use ofthe corners as stable features, this model also allows to estimatethe size of the vehicle. Often, this box-model assumption isa suitable approximation. However, there exist scenarios inwhich a more accurate representation of the shape is important,especially when no specific types of objects can be expected,but accurate shapes have to be estimated.

An important application for this requirement are pre-crash scenarios. Here it is crucial to detect if an accidentis unavoidable. A bounding box model easily overrates thedanger in such cases and produces false alarms. Furthermore,a tracking system that is not restricted to cars and its typicalshape is eligible.

The presented approach in this paper aims to tackle thesetwo problems: on the one side keeping a stable referencepoint to achieve accurate tracking results and on the other sideestimating a detailed shape that allows to overcome limitationscaused by abstract approximations.

To achieve these goals, we use occupancy grid maps tostore associated measurements and represent the shape of theobject. The occupancy grids are normally used to map thestatic environment of a vehicle or robot. In contrast to thisapplication the occupancy grids in the presented approach arefixed to the objects, i.e. they are object-local.

This approach entails the limitation that the shape of theobjects has to be static while tracking. The major benefit isthe complexity of shapes that can be accurately modeled.Furthermore, a fixed anchor point on the shape is availablethat prevents the errors caused by a moving reference point asshown in Fig. 2.

Two Sick LD-MRS400001 laser scanners, mounted at thefront of our test car (see Fig. 1), provide the input data for thedeveloped algorithm.

Cooperative traffic (e.g. Vehicle-2-vehicle communication)is not considered since the system is designed to work withgeneric shaped objects that does not necessarily provide thisdata (bicycles, pedestrians, common vehicles). Also a highlyaccurate inertial navigation platform fused with a D-GPS

Page 2: Simultaneous Tracking and Shape Estimation with Laser Scannersfusion.isif.org/proceedings/fusion2013/html/pdf... · vehicle. The laser scanners has four layers with a vertical field

Fig. 1. Test vehicle equipped with sensors used for the perception systemand for evaluation purposes.

Direction of movement

Fig. 2. Example of the problem that arises when using the mean point ofthe laser measurements as reference position while tracking

signal is only used for evaluation purposes, since it is onlysuitable for research vehicles.

II. RELATED WORK

Different attempts have been made in literature to combineoccupancy grid maps and tracking algorithms on an objectlevel. While occupancy grid maps, first developed by Elfes [1],are a good approach to model static environments on a rawlevel of abstraction, tracking algorithms estimate the dynamicstate of moving objects on a high level.

The combination of these two models is mostly achievedby making the grid cells dynamic, e.g. by storing velocitydata in each grid cell. Examples for these kind of approachescan be found in Coue et al. [2], where a 4-D occupancy gridmap (position and velocity in 2-D) is developed. The besidethe occupation of each cell, the velocity components can beestimated. Also, Danescu et al. [3] shows an approach wherethe occupation within each cell consists of particles that canmove from cell to cell. Tracking is based on the estimatedparticle motion.

Instead of a global grid map with dynamic informationcontained in its cells, the approach we propose aims to movethe whole grid assigned to an object. The following works alsobuild up a shape or map that is moved over time. Effertz [4]uses a local map to achieve the contour of an object. Insteadof integrating the associated raw sensor data, the data isabstracted and the resulting polygon is integrated into a map.This approach needs an additional association step since afeature map is created. Steinemann et al. [5] use a highlyaccurate Velodyne HDL-64e 3D-laser scanner to accumulate alocal map of a passenger car in 3D. Several deterministic cost

functions are combined to determine the best object estimate.Moosmann et al. [6] accumulate a local point cloud using thewell-known optimization algorithm ICP. New sensor data ismatched to already accumulated data of the object from thepast in a so-called refinement step. The resulting translationand rotation of this refinement step is used as an input to theKalman filter update step.

III. PERCEPTION SYSTEM

The tracking system is integrated in the perception systempresented at [7].

The tracking is accomplished in the vehicle coordinatesystem which has its origin at the center of the rear wheelaxis, with the x-axis to the front and the y-axis to the left ofthe vehicle.

A localization module based on a global occupancy mapimproves the proprioceptive measurements provided by the egovehicle. The laser scanners has four layers with a vertical fieldof view of 3.2◦ between the topmost and lowermost layer. Notall of them can be used, because of the pitch and roll motion ofthe ego vehicle while driving and the unevenness of the road.Because this lack of 3D-information, the tracking is performedin two dimensions.

The measurements used to track the objects first have tobe distinguished from non-relevant data. Literature does notregard the extraction of dynamic objects as trivial, howeverseveral approaches can be found to cope with the detection ofdynamic objects, e.g. [8], [9], [10].

This work assumes that these measurements caused by anobject are associated correctly. However the algorithm tendsto be robust against association error to a certain degree.For experiments the global nearest neighbor association wasapplied after segmenting the data with the DBSCAN algorithmin polar coordinates.

IV. MATHEMATICAL APPROACH

Each object can be thought of to be divided into two parts.On the one hand a dynamic state, which is similar to the stateof a standard tracking algorithm. It describes the state vectorthat usually changes over time, such as the position and thevelocity. This part will be described in IV-A. On the other handthere is the static part, consisting of the shape and the offsetof the anchor point and the center of gravity of the object.Static in this context means that the true shape and offset ofthe real object are constant, i.e. do not change over time, butare initially unknown and have to be estimated. This part isdescribed in IV-B.

A. Dynamics Model

As stated in Sec. IV the modeling of a tracked object isdivided into a dynamic state representation and a static shaperepresentation.

The dynamic part consists of the following state variables

Xk = [ xk yk Ψk vk ωk ]T (1)

Page 3: Simultaneous Tracking and Shape Estimation with Laser Scannersfusion.isif.org/proceedings/fusion2013/html/pdf... · vehicle. The laser scanners has four layers with a vertical field

Ck

Ak

cy,k

cx,k

vk

y

x

Ψk

xk

yk

xk

yk

Fig. 3. The figure shows the vehicle model. It is divided into a dynamicstate and a static occupancy grid map. Red denotes the offset from the anchorpoint to the dynamic reference point.

where k denotes the current time step. xk, yk and Ψk referto the anchor point’s position and orientation of the vehicle inthe 2-D plane. vk is the velocity in the movement direction Ψk

at the center of gravity (see Fig. 3). Orientation and directionof the velocity coincide in this model. Its yaw-rate is denotedby ωk.

A Constant-Turn-Rate/Constant-Velocity-model is used topredict the motion of the object. Since the anchor coordinatesxk and yk can eventually be positioned in arbitrary relationshipto the object, they cannot serve as the representation point ofthe motion. To undertake the prediction, the position of thecenter of gravity has to be calculated, by recalculating it usingthe offset values cx,k and cy,k that are also estimated by thesystem (see Sec. IV-B2). Since the offset values are kept inobject local coordinates, the calculation of the center of gravityis defined by:

[xkyk

]=

[xkyk

]+

[cos Ψk − sin Ψk

sin Ψk cos Ψk

] [cx,kcy,k

](2)

The actual prediction is then performed by evaluating thefollowing equations if it can be assumed that ωk > 0

xk+1 = xk +vkωk

(sin (Ψk + ωk∆T )− sin (Ψk)) (3)

yk+1 = yk +vkωk

(cos (Ψk)− cos (Ψk + ωk∆T )) (4)

Ψk+1 = Ψk + ωk∆T (5)vk+1 = vk (6)ωk+1 = ωk (7)

For very small ωk the model degenerates to a constant-velocity model.

xk+1 = xk + vk∆T cos (Ψk) (8)yk+1 = yk + vk∆T sin (Ψk) (9)

Ψk+1 = Ψk + ωk∆T (10)vk+1 = vk (11)ωk+1 = ωk (12)

After the evaluation the anchor point is calculated byinverting Eq. (2).

Additionally the states are ego motion compensated, i.eany motion resulting from the moving ego vehicle is removed.The position and orientation are represented in the ego vehiclecoordinate system, the velocity and yaw-rate are representedover ground.

B. Shape Model

Object shapes can differ in many aspects and generally canbe complex.

In our approach the object’s shape model Sk = (Mk, Ck)consists of a local occupancy grid map

Mk = {m1,m2, ...,mN} (13)

consisting of grid cells mi and an offset vector

Ck = [ cx,k cy,k ]T (14)

that is described in Sec. IV-B2.

1) Occupancy Grid Map: The shape of the object at timek is represented by the occupancy map Mk.

Shape in this context means the contour of an object seenfrom the top. For automotive applications the height of anobject is not important.

The mapping approach allows various free-form shapes,while not demanding more computational resources with in-creasing complexity of the object shape. Predefined featuresin the model are not required since the raw sensor data isaccumulated.

Major drawback is that shapes have to be inflexible whiletracking, i.e. the shape is assumed to be static. However thealgorithm can adapt to small changes over time.

In a occupancy grid map, a 2D-space (normally the envi-ronment of a the vehicle) is divided into equally sized cells.Each cell is a probabilistic variable describing if the occupationof this cell and is independent from its neighbors.

In this approach the map is used in the same way withthe difference that the origin of the map is fixed to car at thecoordinates xk and yk of the object and it is rotated by Ψk.

Assuming Mk is the grid map at time k and mi denotesa grid cell with index i (i = 1..N ), and the grid cells areindependent to one another, the occupancy grid map can bemodeled as a posterior probability:

P (Mk|z1:k, X1:k) =∏i

P (mi|z1:k, X1:k) (15)

Page 4: Simultaneous Tracking and Shape Estimation with Laser Scannersfusion.isif.org/proceedings/fusion2013/html/pdf... · vehicle. The laser scanners has four layers with a vertical field

where P (mi|Z1:k, X1:k) is the inverse sensor model thatdescribes the probability of occupation given the measurementsZ1:k and the dynamic object state X1:k at all time steps untiltime k. Each measurement consists of n reflexion points Zj ={z1,j , z2,j ...zn,j}. As stated at the beginning of section IV itis assumed that the measurements Z1:k induced by the objectare correctly segmented and associated.

The occupancy values of each cell are calculated by abinary Bayes filter. In practice, the log odds ratio is used to in-tegrate new measurements efficiently. The log odds occupancygrid map is formalized as:

Lk(mi) = logP (mi|Z1:k, X1:k)

1− P (mi|Z1:k, X1:k)(16)

The recursive formulation of map update in log odds ratiois given by [11]:

Lk(mi) = Lk−1(mi) + logP (mi|Z1:k, X1:k)

1− P (mi|Z1:k, X1:k)− L0(mi)

(17)

where Lk−1(mi) and L0(mi) are the previous and prior logodds values of grid cell i.

Instead of performing multiplications, the usage of thelog odds ratio simplifies the calculation to additions andavoids instabilities of calculating probabilities near zero orone. Assuming that no prior knowledge is available, the priorprobability of unknown cells is set to P0(mi) = 0.5, the aboveequation produces the prior log odds ratio L0 = 0.

To obtain the corresponding probability from the log oddsrepresentation again, Eq. (16) can be inverted.

2) Offset: As stated in IV-A, (xk, yk) describes a fixedanchor point on the tracked object that does not change overtime. While this is desirable in order to avoid apparent motion,the direct use of this point as the dynamic reference point is nota good choice. The best reference point for arbitrary objectsis the center of gravity.

So, to apply the motion dynamics, cx,k and cy,k describeoffsets between that fixed point and the estimated center ofgravity of the object (see also Fig. 3) in object coordinates,with the x-axis in the movement direction and the y-axis tothe left.

C. Measurement Model

This section describes how we obtain the measurementlikelihood p(Zk|Xk, Sk) given the object state Xk, the object’sshape Sk and the current laser data Zk that was associated withthe object. Although possible, we currently do not consider therays from the scanner to the laser reflexion centers and markthe space in between the laser scanner and the reflexion asfree space, as it is commonly done. This is mainly due to theheavy computational load the tracing of each ray would imply.

Also as seen in Sec. VI-B this assumption is not alwaysvalid. However it is planned to evaluate possible benefits inthe future.

The measurement likelihood is computed by factoring itunder the assumption of independent laser measurements.

Fig. 4. Local occupancy grid map that represents the shape of the objects.The brightness of a cell shows the probability of its occupation, where blackstands for a high probability and light gray stands for a probability of 0.5,which means ”unknown”. Red points emblematize laser measurements, redellipses the uncertainty of the measurement.

p(Zk|Xk, Sk) =∏i

p(zi,k|Xk, Sk) (18)

The likelihood of each observation zi, which is conditionedon the state Xk and the shape Sk, is computed based on theprobability of hitting an occupied cell.

p(zi,k|Xk, Sk) = eαO(zi) (19)

α denotes a constant, that was set manually (we use0.01 < α < 0.05). O(zi) denotes the measurement’s coveredoccupancy in Mk given the dynamic state Xk. It is defined fora occupancy map by:

O(zi) =

∫Mk

P (Mk|zi, Xk)P (Mk) (20)

and can be approximated for the grid cells mj by:

O(zi) =

N∑j=0

P (mj |zi, Xk)P (mj) (21)

D. Inference

The resulting distribution is multi-modal. Occlusion ofparts of the object result in ambiguities that can only beresolved over time (see Fig. 5).

Standard fitting algorithms like the ICP (iterative closestpoint) only perform well in absence of such ambiguity. Testswith simulated data have shown that an ICP algorithm, wherenew scans are registered to map cells, only work if themeasurement data contains very low or no noise.

Furthermore registering scan data without an extendedsensor model (e.g. by integrating free-space), is not able to

Page 5: Simultaneous Tracking and Shape Estimation with Laser Scannersfusion.isif.org/proceedings/fusion2013/html/pdf... · vehicle. The laser scanners has four layers with a vertical field

Fig. 5. Partially occluded car. White color denotes the area that can beobserved by the sensor. Gray color denotes occluded areas or areas that arenot of the field of view. The red measurements can not resolve the ambiguityof the car position.

handle objects that are leaving or entering the field of view oran occluded area.

The motion model presented in Sec. IV-A is nonlinear.

To take these conditions into account a particle filter forthe Bayesian estimation is used and methods like the EKF(Extended Kalman Filter) or UKF (Unscented Kalman Filter)are not suitable.

For particle filtering the condensation algorithm proposedin [12] is used.

Overall, we have to estimate Xk =[xk yk Ψk vk ωk]

T , and Sk = (Mk, Ck), whichmeans several hundred states have to be estimated forcomparatively big objects and a appropriate cell size of themap.

The number of particles needed grows exponentially withthe number of dimensions of the estimation problem, while thedimensionality of the presented problem is very high. Becauseof that the estimation problem is split.

The dynamic states x, y, ψ, v, ω are estimated via sampling,to use the multi-modal properties of the particle filters. Tosimplify the estimation of the very high-dimensional Sk, itis split up into independent cells, where each cell is now abinary problem. This approach is called Rao-Blackwellisationand was introduced in [13].

So each particle qjk consists of a dynamic state and a shape:

qjk = (Xjk, S

jk) (22)

The weights wjk of the particle are drawn according to

wjk = wjk−1P (Zk|Xjk, S

jk) (23)

Currently the offset is modeled as a deterministic variable,i.e. it is not estimated in a probabilistic manner.

It is difficult to detect the completeness of the shape seenso far and it is impossible to extract the weight distribution ofthe real car from the laser measurement. For this reason thereis no valid foundation to assume a measurement uncertaintyto use for example a Kalman filter to estimate the parameter.

Instead the best guess is to extract a mean point of theoccupied cells. The occupied cells in Mk provide informationabout the center of gravity to update the offset vector Ck.We assume that the mean value of the occupied cells isalways the best guess for the center of gravity for a free-form object. So in each time step a bounding box is calculatedaround occupied cells whose occupation lies beyond a certainthreshold. The mean point of this bounding box is consideredas a measurement for the current offset to the center ofgravity Ck. To smooth this output position a moving averagecalculation over the last L measurements is performed byapplying

Ck =1

L

L∑i=1

Ck−L+i (24)

V. OBJECT TRACKING

The environment perception framework presented in [7]integrates the object tracking algorithm presented in this pa-per. Modules for the detection of moving objects and theassociation of new segmented measurements are available. Alocalization module provides the velocity and yaw-rate of theego-vehicle.

The object tracking itself works according to the followingscheme.

After the detection module provides data of a new object,it is initialized via a two step-initialization to obtain an initialvelocity v0 and an initial orientation Ψ0. The initial statesof the particles are sampled from a Gaussian prior with themean value of v0 and Ψ0. The variances of the priors dependon accuracy of the initialization. They are obtained usingan experimental method. The accuracy of the initializationdirectly influences the number of particles that have to be used.The better the initial velocity and orientation the smaller thevariances can be set and the fewer particles are needed at thebeginning.

The anchor point is set to the mean value of the associatedlaser scanner reflexions in x- and y-direction. With each newscan the object state is predicted via the model described insection IV-A. If new measurements could be associated withthe object, the shape grid map Mk is first used to evaluatethe weight of the particle. After that, Mk is updated using themeasurement model from section IV-C. From the shape, wederive a “measurement” for the estimation of the center pointCk, and update it.

At the moment, we do not use any logic to merge twoobjects, since we found that this scenario was rare in ourdata, although this fact heavily depends on the segmentationalgorithm that is used.

Page 6: Simultaneous Tracking and Shape Estimation with Laser Scannersfusion.isif.org/proceedings/fusion2013/html/pdf... · vehicle. The laser scanners has four layers with a vertical field

The main advantage of the proposed algorithm can also bea drawback in the context of applications that are using thetracking output.

The object shape represented by a grid map describes thereal shape on a raw level. However most follow-up modulesdepend on abstract shape descriptions such as boxes or another polygonal line. To make the algorithm compatible tothose requirements an abstract shape has to be extracted fromthe grid map. For that purpose the convex hull around cellswhose occlusion lie above a certain threshold. This approachrestricts the representable shape to convex shapes. The thresh-old depends on the inverse sensor model and the intended levelof a safety margin. The lower the threshold, the larger the areathe object covers.

VI. EXPERIMENTAL RESULTS

Two Sick LD-MRS400001 laser scanners, mounted at thefront of our test car provide the input data for the developedalgorithm. A Velodyne HDL-64e laser scanner is mounted atthe roof and is used for evaluation purposes only. Data wasrecorded at an open area with one target car and static obstaclesas well as in urban scenarios.

In the scenarios an oncoming vehicle is chosen exemplaryas the main tracked object since its considerable extendedshape, that differs from a box mainly at the front.

A. Scenario 1

Fig. 6 shows the local grid map of an oncoming Mercedes-Benz C-Class Estate turning to its rights in the field of view.White cells mean unknown areas, while black color denotesocclusion. The right side (which is the upper side in Fig. 6)of the car is detected for a very short period of time at thebeginning of the scenario. The back of the car is seen for ashort time at the end of the scene. This scenario is intended toshow the general performance of the algorithm in estimatingthe shape of an object.

The cell size is set to an edge length of 5 cm, which is veryfine grained compared to standard settings, where 10-20 cm areset. While this is computationally very demanding. It showswell the performance of the tracking and shape estimationalgorithm, since almost every area of the target cars surface ismeasured at some frames.

The estimated size that can be extracted from the shape gridmap depends on the threshold and the desired safety margin.For this scenario 100 particles are used.

B. Scenario 2

In the second scenario an oncoming vehicle is tracked.This scenario is more realistic, since oncoming traffic is verycommon in the real world. The proposed tracking algorithmis compared to a standard Kalman filter approach, wherethe mean point of the associated measurements is used as areference point. For both algorithms the borders of the field ofview are not modeled in the sensor model. So only associatedraw data is used as the algorithms’ input.

This scenario is intended to evaluate, beside the estimationof the shape of the object with a larger grid cell size, mainly

x [m]

y [m

]

−2 −1 0 1 2 3

−1.5

−1

−0.5

0

0.5

1

1.5

2

2.5

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Fig. 6. Local grid map of an oncoming Mercedes-Benz C-Class Estate turningto its rights in the field of view. White cells denote unknown areas, while blackcolor denotes occlusion. The right side of the car is detected for a very shortperiod of time at the beginning of the scenario.

the algorithms characteristic to keep a stable reference pointwhen parts of the object are occluded. For that matter the targetvehicle leaves the field of view at some point, a situation thatresembles the disappearance behind another object.

The resulting grid map shown in Fig. 7 is typical for gridmaps of objects built in urban scenarios. The figure shows theshape grid map with cell size of 0.2m of an oncoming vehicleon left lane. Orange denotes occupied grid cells, black cells areof unknown occlusion. The red dot shown the anchor point ofthe grid, the cyan dot represents the reference point obtainedby adding the offset value.

The green line denotes the extracted abstract shape. Theextracted shape depends on the threshold that is set for theminimum occlusion of the cells that are used for the convexhull calculation. For critical applications, such as a pre-crashapplication, the threshold have to be set lower. The resultingextracted shape will be larger to reduce the rate of false positivecrash detections.

While oncoming the vehicle was occluded by a small objectbetween the lanes. Obviously this has not influenced the qualityof the map. The incomplete side of the vehicle’s shape istypical for oncoming vehicle with the laser scanners that areused. The steep angle in this area mitigates the reflexion of thelaser ray to the point where no detection is possible anymore.However, this does not affect the shape estimation algorithm.The tracking system robust against those effects, because dataon a raw level is integrated in the map. Care has to be takenwhen trying to integrate inverse sensor model with free-space,since that effect would violate the assumption of this model.

Fig. 8 shows the estimated velocity of the oncoming carover time by the two algorithms. Red and blue denote thevelocities estimated by the proposed approach respectively thestandard Kalman filter approach.

When the target vehicle leaves the field of view of thelaser scanner around 16.2s significant decrease of the velocityis estimated by the extended Kalman Filter. This effect canbe explained by taking into account Fig. 5. Once the vehiclereaches the border of the field of view the mean point of

Page 7: Simultaneous Tracking and Shape Estimation with Laser Scannersfusion.isif.org/proceedings/fusion2013/html/pdf... · vehicle. The laser scanners has four layers with a vertical field

y [m]

x [m

]

−3 −2 −1 0 1 2 3

−3.5

−3

−2.5

−2

−1.5

−1

−0.5

0

0.50

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

Fig. 7. Shape grid map of an oncoming vehicle on left lane. Orange denotesoccupied grid cells, black cells are of unknown occlusion. The red dot showsthe anchor point of the grid, the cyan dot represents the reference pointobtained by adding the offset value.

the measurements moves with half the actual velocity in x-direction.

The estimated velocity of the proposed algorithm alsodecreases slightly. That is because the uncertainty increases.

A second effect that cannot be seen in the figure is thata velocity component in y-direction towards the ego-vehicleis estimated, since the front of the object disappears whilethe detectability of the side of the object rises. This leads toa movement towards the ego-vehicle and can lead to criticalfalse positives for crash detections.

14.6 14.8 15 15.2 15.4 15.6 15.8 16 16.2 16.4 16.620

22

24

26

28

30

32

Time [s]

Velo

city [m

/s]

Grid Map−based Tracking

EKF Filter Tracking

Fig. 8. Estimated velocity of the oncoming car over time by the twoalgorithms. Red and blue denote the velocities estimated by the proposedapproach respectively the standard Kalman filter approach.

VII. CONCLUSIONS

We have presented a new concept for tracking dynamicextended objects while simultaneously estimating their shape.Experimental results with real sensor data showed promisingresults, although there are further investigations needed whenthere is more data with corresponding ground-truth available.The algorithm and the implementation has to be improved toreach real-time requirements by applying shared maps betweenparticles and parallelization of the particle calculations. Sincethe initialization is crucial to decrease the particle number a

radar-based initialization routine will be implemented. Further-more radar will be used to improve the tracking results byfusing laser scanner with radar data while especially taking intoaccount the shape estimated by the laser scanners. Probabilisticstate-of-the-art association algorithms will help to improve thetracking algorithms in crowded urban traffic scenarios.

ACKNOWLEDGMENT

The methods presented in this paper were developed withinthe integrated project interactIVe, a European research activitythat contributes to accident avoidance by active interventionfor Intelligent Vehicles.

The work is part of the research activities of the Daim-ler AG to develop novel environment perception systems.

REFERENCES

[1] A. Elfes, “Using occupancy grids for mobile robot perception andnavigation,” IEEE Computer, vol. 22, no. 6, pp. 46–57, 1989.

[2] C. Coue, C. Pradalier, C. Laugier, T. Fraichard, and P. Bessiere,“Bayesian occupancy filtering for multitarget tracking: an automotiveapplication,” The International Journal of Robotics Research, vol. 25,no. 1, pp. 19–30, 2006.

[3] R. Danescu, F. Oniga, and S. Nedevschi, “Modeling and tracking thedriving environment with a particle-based occupancy grid,” IntelligentTransportation Systems, IEEE Transactions on, vol. 12, no. 4, pp. 1331–1342, 2011.

[4] J. Effertz, “Autonome fahrzeugfuhrung in urbaner umgebung durchkombination objekt- und kartenbasierter umfeldmodelle,” Ph.D. dis-sertation, Technischen Universitt Carolo-Wilhelmina zu Braunschweig,2009.

[5] P. Steinemann, J. Klappstein, J. Dickmann, F. von Hundelshausen, andH.-J. Wnsche, “Geometric-model-free tracking of extended targets using3d lidar measurements,” Laser Radar Technology and ApplicationsXVII, vol. 8379, pp. 83 790C–83 790C–12, 2012.

[6] F. Moosmann and T. Fraichard, “Motion estimation from range imagesin dynamic outdoor scenes,” in Proceedings of the IEEE InternationalConference on Robotics and Automation, Anchorage, Alaska, USA,May 2010, pp. 142–147.

[7] M. Schutz and K. Dietmayer, “A flexible environment perceptionframework,” in Advanced Microsystems for Automotive Applications2013, G. Meyer, Ed. Springer Berlin Heidelberg, 2013.

[8] C.-C. Wang, C. Thorpe, and S. Thrun, “Online simultaneous localizationand mapping with detection and tracking of moving objects: theory andresults from a ground vehicle in crowded urban areas,” in Roboticsand Automation, 2003. Proceedings. ICRA ’03. IEEE InternationalConference on, vol. 1, sept. 2003, pp. 842 – 849 vol.1.

[9] D. F. Wolf and G. S. Sukhatme, “Mobile robot simultaneous localizationand mapping in dynamic environments,” AUTONOMOUS ROBOTS,vol. 19, pp. 53–65, 2005.

[10] M. Schutz, Y. Wiyogo, M. Schmid, and J. Dickmann, “Laser-basedhierarchical grid mapping for detection and tracking of moving objects,”in Advanced Microsystems for Automotive Applications 2012, G. Meyer,Ed. Springer Berlin Heidelberg, 2012, pp. 167–176.

[11] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics (IntelligentRobotics and Autonomous Agents series), ser. Intelligent robotics andautonomous agents. The MIT Press, Aug. 2005.

[12] M. Isard and A. Blake, “Condensationconditional density propagationfor visual tracking,” International Journal of Computer Vision, vol. 29,pp. 5–28, 1998, 10.1023/A:1008078328650.

[13] A. Doucet, N. De Freitas, K. Murphy, and S. Russell, “Rao-blackwellised particle filtering for dynamic bayesian networks,” inProceedings of the Sixteenth conference on Uncertainty in artificialintelligence. Morgan Kaufmann Publishers Inc., 2000, pp. 176–183.