Optoelectronic Stochastic Parallel Processors for real time image processing Groupe Physique des...

Preview:

Citation preview

Optoelectronic Stochastic Parallel Processors for real time image processing

Groupe Physique des images. Ph. Lalanne, J.C.Rodier et P.ChavelInstitut d’Optique Théorique et appliquée, CNRS URA 14

A.Dupret, E.Belhaire and P.GardaInstitut d’Electronique Fondamentale, CNRS URA 22.

Alvaro CASSINELLI

I.Glaser and A.A.Friesem Weizmann Institute, Physics of Complex Systems.

2Plan

I) Low level image processing

Mathematical Model (energy function)

Combinatorial problem and computational load (stochastic optimization).

II) Parallel Processor Implementation

First prototype on CMOS : SPIE600 (24x25)

Noise cleaning demonstration using SPIE600

Demonstrator enhanced with Optical Interconnections

Motion Detection Demonstrator

III) Modeling of an hybrid Silicon/AsGa Parallel Processor

Monolithic convolution : proposed architecture

Performance evaluation

IV) Further research

I) Optimization and Low-level image processing

Bayesian approach to image restoration

Optimisation problem

4

Inverse problem :

Bestimation...

observation...

A?

…there are too many possible solutions

…introducing a priori knowledge enables us to consider only the more probable !

)()/().()/(

BPxBPxPBxP

This is a “probabilistic regularization" technique (Bayesian restoration)

a priori term:- Favor smooth images(MRF field)

Likelihood term : - Favor images which “looks like the observation”

- it is constant on x

(a posteriori probability )

We look for the image A maximizing P(x/B)...

Image restoration : Model

5

Many low level image processing

tasks …

… can be seen as an optimization problem (absolute minimum of an energy function)

bad

good

Conclusion

Optimisation problemAlvaro Cassinelli:

REM: Il existe bien d'autres formes d'estimateurs que le MAP.

Le modèle bayesien est aussi utile pour grand nombre de problèmes en traitement d'images autres que les problèmes de restauration, car il représente une façon formelle pour introduire des informations a priori dans le problème.

Alvaro Cassinelli:

REM: Il existe bien d'autres formes d'estimateurs que le MAP.

Le modèle bayesien est aussi utile pour grand nombre de problèmes en traitement d'images autres que les problèmes de restauration, car il représente une façon formelle pour introduire des informations a priori dans le problème.

)(),()/( xregUBxobs

UBxU +=

From observation model

),(exp1)/( BxU

ZBxP

rewriting the a posteriori probability as :

we define the « energy » of the image :

From regularisation model

6Exemple : binary images

… if image is binary :

.:::

....010

....110

....101

x

),( ),(),(),(

1.2.121.2.12)/(ji jiVlk

klijji

ijij xxBxBxU

Interaction neighborhood

),( ),(),(

2

),(

2)/(ji jiVlk

klijji

ijij xxBxBxU

Energy

Image B Image x

a priori : avoid discontinuities

Constraint to the data

Interpretation

Constraint to the data term Regularisation term

7Practical Resolution : Deterministic updating

image 256x256 Energy function has 65536 variables !Problem :

Exhaustive search ? Impossible ! (2256x256 1020000 images)

First idea : deterministic updating

Random initialisation :

For each pixel (i,j)…

U0=U(x0)=U(1,0,1,1,0,….)

U(1,0,?,1,0,….)

…choose the value that makes the energy decreases :

ijij UF = U(…, xij= 0, ...) - U(…, xij=1, ...)

" Local FORCE "Fij>0

New state 1

New state 0

Fij<0

8Practical resolution : stochastic optimization

The problem :

Deterministic optimization

local minima xO of the energy !

xO

U<0 U>0

x=A

…image is “trapped” in a

- controlled “mistakes” are allowed :

Simulated Annealing

Stochastic Optimization

Solution :

9

Repeat by slowly decreasing the « temperature » :

For each pixel (i,j), compute the local force :

… and choose more likely the state which makes the energy decrease, (but not systematically) :

T

Fx

ijij

exp1

1]1Pr[

Stochastic updating of image :

-4 -2 0 2 40

0,5

1

T=

T=0

Force

Pro

bab

iliy

ijij UF = U(…, xij= 0, ...) - U(…, xij=1, ...)

Practical resolution : Simulated Annealing Algorithm

10Practical resolution : Parallel Updating

For many applications, computation of the force is a local operation :

ji,)lk,(

klijij 1.212V

xBF

Parallel Processor

Interconnection topology

11Conclusion

Low-level Image processing

Optimization problem

Stochastic

Processor

P P P P S S

Energy Function

Parallel

Updating

Local force computation

Optoelectronics ...

OO

II) Optoelectronic implementation

• Motion detection demonstrator

• First prototype : CMOS 24x25 SPIE600 processor

• Demonstration : restoration of binary images

• Enhanced demonstrator using optical interconnects

13SPIE600 parallel processor

CMOS circuit 0,8 m

one bit memory xij

local force computation :

deterministic updating ! (comparator)

optical (2 PD) electrical (currents)

)ji,()lk,(

klijij 1.212V

xBF FC FS

Elementary processor

xij

FS

FC

Noise cleaning demonstrator

(on binary images)

15Noise cleaning demontrator : architecture

Laser source

CCD

SLM SPIE600

Corrupted binary image

IN

Restored image

OUT

pixel Bij

FS (-)

+-

FS (+)FC (+)

FC (-)

(dual rail-encoding)

PEij

xij

)ji,()lk,(

kijij 1.212V

lxBF FC FS

…deterministic updating !

16…stochastic updating

+

=

Speckle generator

Laser diode

multimode fiber

Rotating diffuser

Deterministic comparator (fixed threshold)

xij

FS (+)

FS (-)

FC (+)

FC (-)

“Stochastic” comparator T=0

T2>T1>0

Fij>0Fij<0P

rob

ab

ilité

Laser power T force locale

(seuil aléatoire)

FijT xij

17Optoelectronic Simulated Annealing

- 50 temperature steps

- several thousand updating cycles

- … everything in 40 ms

Simulated Annealing :

Image CCD (annealing)

Laser source

CCD

SLM SPIE600

from Speckle generator

18Noise cleaning demonstrator : view of the setup

19Noise cleaning demonstrator : some experimental results

Slow computer interface «only»  video rate (23 images/sec)

Without speckle (deterministic)

With speckle (stochastic)

Enhanced demonstrator with

Optical interconnections

21

Electrical interconnections : cumbersome, short range, fixed...

Shift-invariant neighborhood optical convolution !

Optical interconnections : optical convolutionAlvaro Cassinelli:

à partir de ce moment, les interconnexions électriques ne sont plus utilisées

Alvaro Cassinelli:

à partir de ce moment, les interconnexions électriques ne sont plus utilisées

Dammann gratings (2-level phase)

SLM (B) SPIE600

(4 neighboors)

(8 neighboors)

4-f architecture

+

-

+

-

+

-

+

-

PE (i,j-1)

PE (i+1,j)

PE (i-1,j)

+

-

PE (i,j+1)

Optical neighborhood

22Optical interconnection : test of optical loop

4 nearest neighborhood interconnection topology :

8 nearest neighborhood interconnection topology :

Alvaro Cassinelli:

-aberrations

-il faut détruire la coherence spatiale des sources laser des SLMs

- de toute manière, les irrégularités du voisinage n'affecteront pas outre mesure le résultat du traitement (voisinage)

Alvaro Cassinelli:

-aberrations

-il faut détruire la coherence spatiale des sources laser des SLMs

- de toute manière, les irrégularités du voisinage n'affecteront pas outre mesure le résultat du traitement (voisinage)

23

source laser

CCD

SLM (A)SPIE600SPIE600

From laser diode fiber A

FC

SLM (A)

Optically interconnected demonstrator : optical “force”

)ji,()lk,(

klijij 1.212V

xBF

SLM A

FC

SLM B + grating

FS

diode B power

diode A power =

Speckle generator

From laser diode fiber B grating

FS

SLM (B)

Electronic feed-back loop

(corrupted image)

input B

Aoutput

(restored image)

24

SLM-A

SLM-B

CCDSPIE600

Speckle generator input

Dammann grating

Laser diodes

21 cm

35 cm

14 cm

Optically interconnected demonstrator

Motion detection demonstrator

(with optical interconnections)

26Motion detection : principles

Goal :Input data : gray level image sequence

Output : image segmented in two motion regions (Binary Mask)

Hypothesis : Still camera

Quasi-constant scene illumination...

Relationship between luminance and motion

I(t)

MB(t)

27

- Previous field BM(t-1) is known…

- Observed field (input) CD(t) ...MB(t) ?

?

Changes Detection Binary field

I(t)

I(t-1)

BM(t)

-

BM(t-1)

?

Motion detection : algorithm

…without noise :

CD =1 : flip motion label

CD =0 : save motion label

MB(t)= CS(t) XOR MB(t-1)

XOR

…with noise :

!?

B(t)= CD(t) XOR BM(t-1) BM(t)

Restoration of the binary «image» :

XOR

Rest.

28

Detected motion(Binary Mask)

Motion detection : experimental results

Performances :

Input sequence Change Detection

24 sec/image (optical 8 nearest neigh.)

5 sec/image (optical 4 nearest neigh.)

0,04 sec/image (electronic 4 nearest neigh.)

III) Modeling of an hybrid Si/AsGa Processor

• Hybrid technology and monolithic convolution

• Performance estimation

30Hybrid prototype : “monolithic” convolution

…slow optical loop because of electronics !!

«Hybrid» technology

CMOS circuit : computation (threshold and memory)

MQW AsGa array : input/output

bonding pads

Modulation/detection

GaAs

Alvaro Cassinelli:

-la puce en AsGa : détecte et module

-la puce Silicium : calcule

Alvaro Cassinelli:

-la puce en AsGa : détecte et module

-la puce Silicium : calcule

“monolithic” feed-back ?

Source/detector pixel array !!

31Hybride CMOS/MQW prototype : evaluated architecture

(the image to be processed is entered electronically)

Global architecture

Hybrid circuit

grating

Read laser beam

Array illuminator

Elementary processor

modulators detectors

+-

xij

xij FS(+)

FS(-)

xij

Bij

- pair of modulators

- pair of detectors

- 2 bits : xij et Bij

32Prototype en technologie hybride : bilan optoélectronique

System performances are mainly limited by the optical power

0 10 20 30 40 50 60 70 80 90100

101

102

103

104

Heat dissipation

Q =5W

cadence

0 10 20 30 40 50 60 70 80 90100

101

102

103

104

Electrical power

P =5W

cadence

83,3 MHz

0 10 20 30 40 50 60 70 80 90100

101

102

103

104

105

106

Updating frequency (MHz)

Ptot= 1W

f4=73.3 MHz

f8=66.8 MHz

f12 =64.1 MHz

Hybrid array 16x32 PEs

Optical Power (mW)

33Hybrid CMOS/MQW prototype : final performances

In the worst case, 500 times video rate ! (independently on the array size)

Système de voisinage Nombre de recuits par seconde

SPIE600 (voisinage electronique) 23

SPIE600 (voisinage optique) < 0,02

Circuit hybride (voisinage 4) 36700

Circuit hybride (voisinage 8) 16700

Circuit hybride (voisinage 12) 12800

IV) Conclusion

Results

Possible research directions

35Results

Performances :

Optically interconnected system

Dedicated Machines for image processing

Optics in computing :

Optoelectronic demonstrator :

- noise cleaning, motion detection- Real time simulated annealing

- Silicon chip : 5 seconds/image- Hybrid circuit : tens of thousand images/sec

Optical random number generation

Compact

36

Inte

rco

nn

ecit

on

op

tica

l ban

dw

ith

Compared performances

10 101 10 2 103 104 10 5 106 10 7 1080

Number of optical interconnections in the system

1010

10

10

10

10

10

10

2

4

6

8

10

12

SYSTEM performances (bits/sec)

1012

1014

108

HERE/98

BS/98

CITR/95

AT&T/93

SPARCL/95

AT&T/91

PPOS4

E4O4

O8 O12

PPOS12

OHM4 2002OHM4 2007

FET-SEED

CMOS-SEED

S-SEED

CLF/CI + PnpN

CLF/CI + Si

VCSEL/MSM

Multi-chip

demonstrators

PPOS8

Tera Bit /sec

37Possible research directions...

Rapid and local reconfigurable interconnects :

Optical random number generator :

OSPP with reconfigurable optical interconnects :

…a general purpose optoelectronic stochastic parallel processor for low-level image processing ?

…stochastic neural networks (Boltzmann Machines)

...cryptograpy, telecom channel simulation, Monte Carlo methods...

Alvaro Cassinelli:

(speckle : Gb/s contre Mb/s pour puce VLSI, 1998)

Alvaro Cassinelli:

(speckle : Gb/s contre Mb/s pour puce VLSI, 1998) Micro-optic integration.

Recommended