A unifying framework for lossless and progressive image coding

  • Published on
    02-Jul-2016

  • View
    214

  • Download
    2

Embed Size (px)

Transcript

  • Pattern Recognition 35 (2002) 627638www.elsevier.com/locate/patcog

    A unifying framework for lossless and progressiveimage coding

    Amel Benazza-Benyahiaa, Jean-Christophe Pesquetb; aDepartement de Mathematiques Appliquees, Signal et Communications, Ecole Superieure des Communications de Tunis,

    Parc Technologique des Communications, Cite El Ghazala, 2083, Ariana, TunisiabLaboratoire Syst%emes de Communication, Universite de Marne-la-Vallee, 5 Boulevard Descartes, Champs sur Marne 77454

    Marne la Vallee Cedex 2, France

    Received 16 November 2000; accepted 16 November 2000

    Abstract

    Progressive coding is a desirable feature for image database telebrowsing or image transmissions over low bandwidthchannels. Furthermore, for some applications, exact image reconstruction is required. In this paper, we show that mostof the lossless and progressive coders can be described by a common nonlinear subband decomposition scheme. Thisunifying framework provides useful guidelines for the analysis and improvement of the considered decompositionmethods. Finally, we compare the respective performances of these methods in terms of Shannon entropy for severalimages and also evaluate their compression ability when combined with a hierarchical coding technique. ? 2001 PatternRecognition Society. Published by Elsevier Science Ltd. All rights reserved.

    Keywords: Lossless image compression; Progressive image reconstruction; Nonlinear decompositions; Subband coding; Wavelets

    1. Introduction

    As a result of the generalized use of images, there is anincreasing need for e8cient image compression methods.Furthermore, the ability of progressively reconstructingthe images constitutes a useful functionality for compres-sion schemes. Indeed, a single compressed 9le may beshared by several target output devices with di:erent res-olutions. Besides, during an interactive visual search overa large image database, an initial part of the informationallows the receiving device to get a crude approxima-tion of the image. As transmission proceeds, the receivergradually constructs approximations of increased visualquality. If the intermediate displayed image is judged ofno interest, the transmission can be aborted. This results

    Corresponding author.E-mail addresses: ben.yahia@planet.tn (A. Benazza-

    Benyahia), pesquet@univ-mlv.fr (J.-C. Pesquet).

    in a gain in terms of bit rate, especially when the userdoes not completely download the image. Multireso-lution decompositions [1] are very compelling toolsfor progressive image coding as they provide hierar-chical representations of images. However, for med-ical imaging or remote sensing, lossless compressionis sometimes required since the least distortion in thereconstructed image can lead to errors in data inter-pretation or diagnosis. In this context, the greatest caremust be taken in order to ensure an exact coding. Thus,the very large amount of data involved in the 9eld ofpicture archiving=transmission systems must be cou-pled with the fact that no distortion is allowed. To thisrespect, since the pioneering works of Burt and Adel-son [2], a certain number of pyramidal decompositionswhich are information preserving, have been proposed.In recent works [3,4], nonlinear multirate 9lter bankswith maximal decimation and perfect reconstructionwere investigated. These 9lter banks are closely related

    0031-3203/01/$22.00 ? 2001 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.PII: S0031-3203(01)00065-6

  • 628 A. Benazza-Benyahia, J.-C. Pesquet / Pattern Recognition 35 (2002) 627638

    to other recent developments concerning the liftingscheme [5,6] and nonlinear multiresolution analyses[7].Our objective in this study is to show that the analysis

    scheme described in Ref. [4] provides a unifying frame-work for most of the decompositions proposed until nowin the context of progressive and exact coding. This pa-per is organized as follows. Section 2 describes generalschemes for nonlinear subband decomposition. Section 3allows us to review most of the existing pyramidal codersin terms of such analysis schemes. Sections 3.1 and 3.2are, respectively, devoted to Laplacian-like pyramids andinterpolative pyramids. Section 4 is concerned with non-linear wavelet-like subband decompositions which ap-pear as special cases of the considered decompositionschemes. Finally, Section 6 provides some experimen-tal results in order to compare the performances of thesecoders.

    2. Considered decomposition scheme

    In Refs. [4,8], a generalization of the matrix triangu-larization procedure to a linear 1D subband decomposi-tion was shown to lead to the class of nonlinear M -banddecompositions depicted in Fig. 1. Such a decompositionscheme will be denoted by NL1. The resulting subbandscan be obtained by the following equations:

    y1 =A1[x1] + C1[z2; : : : ; zM ]; (1a)

    yi =Ai[xi] +Bi[A1[x1]; : : : ;Ai1[xi1]]

    +Ci[zi+1; : : : ; zM ]; i {2; : : : ; M 1}; (1b)

    yM =AM [xM ] +BM [A1[x1]; : : : ;AM1[xM1]]; (1c)

    Fig. 1. M -band NL1 decomposition 9lter bank with exact reconstruction (derived from the linear case).

    where x1; : : : ; xM are the polyphase components (of orderM) of the 1D signal x to be decomposed. Furthermore,we have

    i {1; : : : ; M 1}; zi = yi Ci[zi+1; : : : ; zM ] andzM = yM : (2)

    Perfect reconstruction is guaranteed for arbitrary oper-atorsBi ;Ci and injective operatorsAi, for i {1; : : : ; M}[4]. In particular, these operators can be linear or non-linear 9lters. The associated synthesis 9lter bank is pro-vided in Fig. 2. By combining the operators Bi and Cifor each subband, a more general structure (denoted byNL2) is obtained as shown in Fig. 3. This can be seen byrewriting Eqs. (1a)(1c) as functions of xi and yj only:

    y1 =A1[x1] + C1[y2; : : : ; yM ];

    yi =Ai[xi] +Bi[A1[x1]; : : : ;Ai1[xi1]]

    +Ci [yi+1; : : : ; yM ]; i {2; : : : ; M 1};yM =AM [xM ] +BM [A1[x1]; : : : ;AM1[xM1]]; (3)

    whereCi [yi+1; : : : ; yM ] is calculated fromCi[zi+1; : : : ; zM ]by recursively replacing zi by yi Ci[zi+1; : : : ; zM ] start-ing with zM and continuing with zM1 and so on. Thus,such a subband decomposition with exact reconstructionis a very versatile tool for building multiresolution de-compositions.As will be seen subsequently, two-band decomposi-

    tions are very often used (M=2). In Fig. 4, we propose asomewhat more sophisticated variant of the two-band de-composition. The operator P1 may be useful in strength-ening the prediction operation performed by the operatorB2 as it allows to introduce a dependence upon either

  • A. Benazza-Benyahia, J.-C. Pesquet / Pattern Recognition 35 (2002) 627638 629

    Fig. 2. M -band synthesis 9lter bank associated with the decomposition 9lter bank depicted in Fig. 1.

    Fig. 3. NL2 structure generalizing the M -band decomposition structure proposed in Fig. 1.

    past or future samples of x2. OperatorsP2 andP3 may beuseful in decorrelating the output signals. For instance,they can be thought of as tools for an additional di:er-ential predictive coding stage. For the perfect recoveryof the signal, the operators P1;P2 and P3 must be eitherstrictly causal or anti-causal. 1 It must be noted that theproposed structure is similar to the structure describedin Ref. [9]. However, we do not restrict here the opera-tors to be in9nite impulse response linear 9lters as donein Ref. [9]. Indeed, any kind of nonlinear operators canbe used such as integer-to-integer operators, rank order9lters or morphological operators.

    1 An operator with input signal x(n) and output signal y(n)is strictly causal (resp. anticausal) if y(n) is a function ofx(n 1); x(n 2); : : : (resp. x(n+ 1); x(n+ 2); : : : ).

    Fig. 4. General analysis 9lter bank (M = 2).

    As our objective is image coding, the extension to the2D case must be realized. For the sake of simplicity,this can be handled in a separable manner but other sub-sampling lattices (such as quincunx lattices) can also beenvisaged.

  • 630 A. Benazza-Benyahia, J.-C. Pesquet / Pattern Recognition 35 (2002) 627638

    3. Review of pyramidal coders

    As already mentioned in the introduction, an impor-tant merit of the decomposition structure in Eq. (3) is toprovide a unifying view of the main lossless compres-sion schemes based on nonlinear multiresolution anal-yses. It is worth noting that pyramidal structures [1,2]have been recognized early as very interesting tools forlossless progressive image coding. It is possible to dis-tinguish two families of pyramidal structures accordingto the process employed to generate them. The 9rst classcontains Laplacian-like pyramids while the second onecorresponds to interpolative pyramids. In this section, weare interested in demonstrating that the main pyramidalnonexpansive structures belonging to these two classesare special cases of the NL1 decomposition.

    3.1. Laplacian-like pyramids

    3.1.1. BurtAdelson pyramidThe input image x(m; n) of size 2JN 2JM is repre-

    sented by a collection of layers containing the sub-imagesf(j)(m; n) of size 2JjM 2JjN (j = 0; : : : ; J ). Thereare di:erent ways of constructing such a pyramid [2,10].Generally, decimated and averaged versions g(j+1)(m; n)of x(m; n) are recursively generated:

    g(j+1)(m; n) =Ld

    ; s=Ldd(r)d(s)g(j)(2m r; 2n s);

    06 j 6 J 1 (4)with g(0)(m; n) , x(m; n) and the separable low-pass9lter is deduced from a 1D kernel {d(r); r=Ld; : : : ; Ld}.On the one hand, the corresponding approximation imageis used to estimate the original image at resolution 2j bymeans of an up-sampling procedure and an interpolation9lter. This latter one is derived in a separable mannerfrom a kernel {e(r); r=Le; : : : ; Le}. Then, the di:erencebetween the original image and the so-obtained estimateconstitutes the jth level of the Laplacian-like pyramid:

    f(j)(m; n),

    g(j)(m; n) Ler; s=Le

    e(r)e(s)g ( j+1)(m r; n s) if jJ;

    g(J )(m; n) if j = J;

    g ( j+1)(m; n),

    g(j+1)(2m; 2n) if m and n are even;

    0 otherwise:(5)

    where denotes the rounding operator. On the otherhand, the low-resolution image is the input image for thenext step of 9ltering and decimation. Repeating the pro-cess leads to BurtAdelson pyramids [2]. Sending thedata from top (j=J ) to bottom (j=0), the original imagecan be recovered level by level with gradual re9nement.However, in the general case, the hierarchical structurerequires 1=3 more data than the source image. Althoughthis increase in data size is compensated by the levelof decorrelation achieved, some e:orts were devoted inorder to provide nonredundant pyramids such as the re-duced Laplacian pyramids (RLP) [11] and the minimumentropy pyramids (MEP) [12]. These pyramids have theadvantage of containing the same number of samples asthe original image.

    3.1.2. Reduced Laplacian and minimum entropypyramidsReduced Laplacian pyramids are nonexpansive pyra-

    mids corresponding to the special case of unit impulseresponse for the decimation 9lter (d(r) = (r)) and thefollowing interpolation kernel: e(r)=(r)+b(r1)+(0:5 b)(r 3), where b is an adjustable parameterthat determines the shape of the frequency response ofthe interpolation 9lter. Suitable values for b range from0.5 to 0.5625. The MEP decomposition is obtained whenb=0:5. This selected pair of kernel forces one fourth ofthe pixels in all the pyramid subimages to zero. Therefore,the input image is represented by an equivalent pyramidof the same number of pixels as in the input image.Initially, RLP and MEP were presented and developed

    with the classical formalism of pyramidal decomposi-tions. In fact, it is possible to express them as a 4-band2D nonlinear subband decomposition which 9ts into ourNL1 decomposition. Indeed, let the polyphase compo-nents of the di:erence image f(1)(m; n) be denoted byyi(m; n); i {1; 2; 3; 4}:y1(m; n),f

    (1)(2m 1; 2n 1);y2(m; n),f

    (1)(2m 1; 2n);y3(m; n),f

    (1)(2m; 2n 1); y4(m; n),f(1)(2m; 2n):(6)

    We then have

    y1(m; n) = x1(m; n)

    r odd

    s odd

    e(r)e(s)x4

    (m+

    r 12

    ; n+s 12

    );

    y2(m; n) = x2(m; n)

    r odd

    s even

    e(r)e(s)x4

    (m+

    r 12

    ; n+s2

    ):

  • A. Benazza-Benyahia, J.-C. Pesquet / Pattern Recognition 35 (2002) 627638 631

    Fig. 5. Arrangement of the input polyphase signals (a), supports of the 9lters calculating y1(m; n) (b), y2(m; n) (c) and y3(m; n)(d). Shaded areas denote non-zero coe8cients, equal gray levels mean identical coe8cient values.

    = x2(m; n)

    r odd

    e(r)x4

    (m+

    r 12

    ; n)

    ;

    y3(m; n)= x3(m; n)

    r even

    s odd

    e(r)e(s)x4

    (m+

    r2; n+

    s 12

    )

    = x3(m; n)

    s odd

    e(s)x4

    (m; n+

    s 12

    );

    y4(m; n) = x4(m; n); (7)

    where

    x1(m; n),x(2m 1; 2n 1) x2(m; n),x(2m 1; 2n);x3(m; n),x(2m; 2n 1) x4(m; n),x(2m; 2n): (8)The arrangement of the polyphase input components

    as well as the supports for the 9lters are shown in Fig.5. We can see therefore that RLP is a special case ofthe 4-band nonlinear decomposition with perfect recon-struction depicted in Fig. 1 since Ai ; i {1; 2; 3; 4} be-come the identity operators, B2 =B3 =B4 = 0, and theremaining operators C1;C2 and C3 simplify as they areonly functions of z4 = x4.

    3.2. Interpolation pyramidal coders

    3.2.1. Hierarchical interpolation pyramidAnother approach to the size expansion problem

    of BurtAdelson pyramidal representation consists incoding a sub-sampled image as well as the di:erencesbetween the remaining pixels and estimations result-ing from successive interpolation stages. A very earlyversion referred to as the Hierarchical INTerpolation(HINT) pyramid was proposed by Endoh and Yamazaki[13]. An example of the interpolation technique of order4 is illustrated in Fig. 6 where 4 4 blocks are con-sidered. It is interesting to point out that higher orderinterpolations could be employed (such as an interpola-tion of order 8 or 16). Pixels marked with correspond

    Fig. 6. 44 blocks used in the HINT pyramidal decomposition.

    to the original image decimated by a factor of 4: theyconstitute the low-resolution image that is 9rst sentby means of a standard coding technique, e.g. DPCM.Nodes of type are then predicted from the 4 nearestneighbors of type . The predicted value is roundedo: to guarantee reversibility and then subtracted fromthe original value to form the prediction error which issent in a second step. Then, pixels X are estimated fromnodes and so as to continue the di:erential coding.The process is iterated in order to interpolate the pixelsof type and . Finally, the transmitted image is formedby the error terms and the pixels. The reconstructionprocedure is quite simple. First, a rough approxima-tion of the original image ( pixels) is received. Theintermediate nodes are exactly calculated by interpo-lating the 4 nearest neighbors of the low-pass subimageand adding to this estimation the corresponding errorterm. The other pixels are subsequently reconstructed inthe same way. HINT method is included in the NL1

  • 632 A. Benazza-Benyahia, J.-C. Pesquet / Pattern Recognition 35 (2002) 627638

    decomposition framework. Indeed, denote by xk; l(n; m),x(4nk; 4ml); (k; l) {0; 1; 2; 3}2, the 44 polyphasecomponents of x(n; m). If these components are ap-propriately re-indexed and the interpolation operators(including rounding) are denoted...