Practical representations of incomplete probabilistic knowledge



<ul><li><p>Computational Statistics &amp; Data Analysis 51 (2006)</p><p>Practical representations of incomplete probabilistic knowledge</p><p>C. Baudrit (a,*), D. Dubois (b)</p><p>(a) Laboratoire Mathématiques et Applications, Physique Mathématique d'Orléans, Université d'Orléans, rue de Chartres BP 6759, Orléans 45067 cedex 2, France</p><p>(b) Institut de Recherche en Informatique de Toulouse, Université Paul Sabatier, 118 route de Narbonne, 31062 Toulouse Cedex 4, France</p><p>Available online 2 March 2006</p><p>Abstract</p><p>The compact representation of incomplete probabilistic knowledge which can be encountered in risk evaluation problems, for instance in environmental studies, is considered. Various kinds of knowledge are considered, such as expert opinions about characteristics of distributions or poor statistical information. The approach is based on probability families encoded by possibility distributions and belief functions. In each case, a technique for faithfully representing the available imprecise probabilistic information is proposed, using different uncertainty frameworks such as possibility theory, probability theory and belief functions. Moreover, the use of probability-possibility transformations enables confidence intervals to be encompassed by cuts of possibility distributions, thus making the representation stronger. The respective appropriateness of pairs of cumulative distributions, continuous possibility distributions or discrete random sets for representing information about the mean value, the mode, the median and other fractiles of ill-known probability distributions is discussed in detail. © 2006 Elsevier B.V. All rights reserved.</p><p>Keywords: Imprecise probabilities; Possibility theory; Belief functions; Probability-boxes</p><p>1. Introduction</p><p>In risk analysis, uncertainties are often captured within a purely probabilistic framework. This suggests that all uncertainties, whether of a random or an epistemic nature, should be represented in the same way. 
Under this assumption, the uncertainty associated with each parameter of a mathematical model of some phenomenon can be described by a single probability distribution. According to the frequentist view, the occurrence of an event is a matter of chance. However, not all uncertainties are random, nor can they all be objectively quantified, even if the choice of values for parameters is based as much as possible on on-site investigations. Due to time and financial constraints, information regarding model parameters is often incomplete. For example, it is quite common for a hydrogeologist to estimate the numerical values of aquifer parameters in the form of confidence intervals according to his/her experience and intuition (i.e. expert judgment). We are then faced with a problem of processing incomplete knowledge.</p><p>Overall, uncertainty regarding model parameters may have essentially two origins. It may arise from randomness due to natural variability of observations, resulting from heterogeneity (for instance, spatial heterogeneity) or the fluctuations of a quantity in time. Or it may be caused by imprecision due to a lack of information resulting, for example, from systematic measurement errors or expert opinions. As suggested by Ferson and Ginzburg (1996) and more recently</p><p>* Corresponding author. Tel.: +33 6 80149785; fax: +33 2 38417205. E-mail addresses: (C. Baudrit), (D. Dubois).</p><p>0167-9473/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.csda.2006.02.009</p></li><li><p>developed by Helton et al. 
(2004), distinct representation methods are needed to adequately tell random variability (often referred to as aleatory uncertainty) from imprecision (often referred to as epistemic uncertainty).</p><p>A long philosophical tradition in probability theory, dating back to Laplace, demands that uniform distributions be used by default in the absence of specific information about frequencies of possible values. For instance, when an expert gives his/her opinion on a parameter by claiming: "I only know that the value of x lies in an interval A", the uniform probability with support A is used. This view is also justified on the basis of the maximum entropy approach (Gzyl, 1995). However, this point of view can be challenged. Adopting a uniform probability distribution to express ignorance is questionable. This choice introduces information that is in fact not available and may seriously bias the outcome of risk analysis in a non-conservative manner (Ferson and Ginzburg, 1996). A more faithful representation of this knowledge on parameter x is to use the characteristic function of the set A, say π, such that π(x) = 1, ∀x ∈ A, and 0 otherwise. This is because π is interpreted as a possibility distribution that encodes the family of all probability distribution functions with support in A (Dubois and Prade, 1992). Indeed, there exists an infinity of probability distributions with support in A, and the uniform distribution is just one among them.</p><p>In the context of risk evaluation, the knowledge really available on parameters is often vague or incomplete. This knowledge is not enough to isolate a single probability distribution in the domain of each parameter. When faced with this situation, representations of knowledge accepting such incompleteness look more in agreement with the available information. 
Of course, the Bayesian subjectivist approach maintains that only a standard probabilistic representation of uncertainty is rational, but this claim relies on a betting interpretation that enforces the use of a single probability distribution in the scope of decision-making, not with a view to faithfully report the epistemic state of an agent (see Dubois et al., 1996 for more discussion on this topic). In practice, while information regarding variability is best conveyed using probability distributions, information regarding imprecision is more faithfully conveyed using families of probability distributions (Walley, 1991). At the practical level, such families are most easily encoded either by probability boxes (upper and lower cumulative probability functions) (Ferson et al., 2003; to appear), by possibility distributions (also called fuzzy intervals) (Dubois and Prade, 1988; Dubois et al., 2000), or by belief functions of Shafer (1976).</p><p>This article proposes practical representation methods for incomplete probabilistic information, based on formal links existing between possibility theory, imprecise probability and belief functions. These results can be applied to modelling inputs to uncertainty propagation algorithms. A preliminary draft of this paper is (Baudrit et al., 2004a).</p><p>In Section 2, we recall the basics of probability boxes (upper &amp; lower cumulative distribution functions), possibility distributions and belief functions. We also recall the links between these representations. All of them can encode families of probability functions. In Section 3, the expressive power of probability boxes and possibility distributions is compared. In Section 4, some results on the relation between prediction intervals and possibility theory are recalled. They allow a stronger form of encoding of a probability family by a possibility distribution, whereby the cuts of the latter enclose the prediction intervals of the probability functions. 
In Sections 5 and 6, we consider a non-exhaustive list of knowledge types that one may meet after an information collection step in problems like environmental risk evaluation. We especially focus on incomplete non-parametric models, for which only some characteristic values are known, such as the mode, the mean or the median and other fractiles of the distribution. For each case we propose an adapted representation in terms of p-boxes, belief functions (Section 5), and especially possibility distributions (Section 6).</p><p>2. Formal frameworks for representing imprecise probability</p><p>Consider a measurable space (Ω, A) where A is an algebra of measurable subsets of Ω. Let 𝒫 be a set of probability measures on the referential (Ω, A). For all measurable A, we define:</p><p>its upper probability</p><p>P̄(A) = sup_{P∈𝒫} P(A)</p><p>and its lower probability</p><p>P̲(A) = inf_{P∈𝒫} P(A).</p></li><li><p>Such a family may be natural to consider if a probabilistic parametric model is used but the parameters, such as the mean value or the variance, are ill-known (for instance, they lie in an interval). It can also be obtained if the probabilistic model relies on imprecise (e.g. set-valued) statistics (Jaffray, 1992), or yet incomplete statistical information (only a set of conditional probabilities is available). In a subjectivist tradition, the lower probability P̲(A) can be interpreted as the maximal price one would be willing to pay for the gamble A, which pays 1 unit if event A occurs (and nothing otherwise) (Walley, 1991). Thus, P̲(A) is the maximal betting rate at which one would be disposed to bet on A. That means P̲(A) is a measure of evidence in favour of event A. The upper probability P̄(A) can be interpreted as the minimal selling price for the gamble A, or as one minus the maximal rate at which an agent would bet against A (Walley, 1991). 
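As a minimal illustration of these definitions (the frame, the family of distributions and the event below are invented for illustration, not taken from the paper), upper and lower probabilities over a finite family of discrete probability measures can be computed directly:

```python
# Illustrative sketch (frame, family and event invented, not from the
# paper): upper/lower probabilities over a finite family of discrete
# probability measures on the frame {1, 2, 3}.

def upper_prob(family, event):
    """Upper probability: sup of P(event) over the family."""
    return max(sum(p[x] for x in event) for p in family)

def lower_prob(family, event):
    """Lower probability: inf of P(event) over the family."""
    return min(sum(p[x] for x in event) for p in family)

# Three candidate distributions, e.g. stemming from an ill-known parameter.
family = [
    {1: 0.25, 2: 0.5, 3: 0.25},
    {1: 0.5, 2: 0.375, 3: 0.125},
    {1: 0.125, 2: 0.5, 3: 0.375},
]

A = {1, 2}
Ac = {3}  # complement of A in the frame
print(upper_prob(family, A))   # 0.875
print(lower_prob(family, A))   # 0.625
print(upper_prob(family, A) == 1 - lower_prob(family, Ac))  # True (duality)
```

The last line checks the conjugacy between the two bounds, which the text formalizes next as P̄(A) = 1 − P̲(Aᶜ).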
That means P̄(A) measures the lack of evidence against A, since we have:</p><p>P̄(A) = 1 − P̲(Aᶜ).</p><p>It is clear that representing and reasoning with a family of probabilities may be very complex. In the following we consider three frameworks for representing special sets of probability functions, which are more convenient for practical handling. We review three modes of representation of uncertainty that can be cast in the imprecise probability model.</p><p>2.1. Probability boxes</p><p>Let X be a random variable on (Ω, A). Recall that a cumulative distribution function is a non-decreasing function F : R → [0, 1] assigning to each x ∈ R the value P(X ∈ (−∞, x]). This function encodes all the information pertaining to a probability measure, and is often very useful in practice.</p><p>A natural model of an ill-known probability measure is thus obtained by considering a pair (F̲, F̄) of non-intersecting cumulative distribution functions, generalising an interval. The interval [F̲, F̄] is called a probability box (p-box) (Ferson et al., 2003; to appear). A p-box encodes the class of probability measures whose cumulative distribution functions F are restricted by the bounding pair of cumulative distribution functions F̲ and F̄ such that</p><p>F̲(x) ≤ F(x) ≤ F̄(x), ∀x ∈ R.</p><p>A p-box can be induced from a probability family 𝒫 by</p><p>∀x ∈ R, F̄(x) = P̄((−∞, x])</p><p>and</p><p>∀x ∈ R, F̲(x) = P̲((−∞, x]).</p><p>Let 𝒫(…</p></li><li><p>… within a certain interval: the possibility Π and the necessity N. The normalized measure of possibility Π (respectively necessity N) is defined from the possibility distribution π : R → [0, 1], such that sup_{x∈R} π(x) = 1, as follows:</p><p>Π(A) = sup_{x∈A} π(x) (1)</p><p>and</p><p>N(A) = 1 − Π(Aᶜ) = inf_{x∉A} (1 − π(x)). (2)</p><p>The possibility measure Π verifies:</p><p>∀A, B ⊆ R, Π(A ∪ B) = max(Π(A), Π(B)). (3)</p><p>The necessity measure N verifies:</p><p>∀A, B ⊆ R, N(A ∩ B) = min(N(A), N(B)). 
(4)</p><p>A possibility distribution π1 is more specific than another one π2 in the wide sense as soon as π1 ≤ π2, i.e. π1 is more informative than π2.</p><p>A unimodal numerical possibility distribution may also be viewed as a nested set of confidence intervals, which are the α-cuts [x_α, x̄_α] = {x, π(x) ≥ α} of π. The degree of certainty that [x_α, x̄_α] contains X is N([x_α, x̄_α]) (= 1 − α if π is continuous). Conversely, a nested set of intervals A_i with degrees of certainty λ_i that A_i contains X is equivalent to the possibility distribution</p><p>π(x) = min_{i=1...n} {1 − λ_i, x ∉ A_i},</p><p>provided that λ_i is interpreted as a lower bound on N(A_i), and π is chosen as the least specific possibility distribution satisfying these inequalities (Dubois and Prade, 1992).</p><p>We can interpret any pair of dual necessity/possibility functions [N, Π] as upper and lower probabilities induced from specific probability families.</p><p>• Let π be a possibility distribution inducing a pair of functions [N, Π]. We define the probability family 𝒫(π) = {P, ∀A measurable, N(A) ≤ P(A)} = {P, ∀A measurable, P(A) ≤ Π(A)}. In this case, sup_{P∈𝒫(π)} P(A) = Π(A) and inf_{P∈𝒫(π)} P(A) = N(A) hold (see De Cooman and Aeyels, 1999; Dubois and Prade, 1992). In other words, the family 𝒫(π) is entirely determined by the probability intervals it generates.</p><p>• Suppose pairs (interval A_i, necessity weight λ_i) supplied by an expert are interpreted as stating that the probability P(A_i) is at least equal to λ_i, where A_i is a measurable set. We define the probability family as follows: 𝒫(π) = {P, ∀A_i, λ_i ≤ P(A_i)}. We thus know that P̄ = Π and P̲ = N (see Dubois and Prade, 1992, and in the infinite case De Cooman and Aeyels, 1999).</p><p>2.3. Imprecise probability induced by random intervals</p><p>The theory of imprecise probabilities introduced by Dempster (1967) (and elaborated further by Shafer, 1976 and Smets and Kennes, 1994 in a different context) allows imprecision and variability to be treated separately within a single framework. 
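Before detailing random intervals, the possibility-theoretic construction above can be sketched in code (the intervals and certainty weights below are invented for illustration): given nested intervals A_i with lower bounds λ_i on N(A_i), build the least specific possibility distribution and evaluate it pointwise.

```python
# Illustrative sketch (data invented, not from the paper): the least
# specific possibility distribution induced by nested intervals A_i
# carrying lower bounds lambda_i on the necessity N(A_i).

def possibility_from_intervals(intervals, lambdas):
    """pi(x) = min{1 - lambda_i : x outside A_i}, and 1 if x is in every A_i."""
    def pi(x):
        outside = [1 - lam for (lo, hi), lam in zip(intervals, lambdas)
                   if not (lo <= x <= hi)]
        return min(outside) if outside else 1.0
    return pi

# Nested intervals A1 within A2 within A3, with growing certainty degrees.
intervals = [(4, 6), (3, 7), (1, 9)]
lambdas = [0.25, 0.75, 1.0]

pi = possibility_from_intervals(intervals, lambdas)
print(pi(5))    # 1.0: inside every interval
print(pi(3.5))  # 0.75: outside A1 only, and N(A1) was only 0.25
print(pi(2))    # 0.25: outside both A1 and A2
print(pi(10))   # 0.0: outside A3, which carries certainty 1
```

Each α-cut of the resulting π contains the corresponding expert interval, matching the nested-confidence-interval reading of a unimodal possibility distribution.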
Indeed, it provides mathematical tools to process information which is at the same time of a random and imprecise nature. Contrary to probability theory, which in the finite case assigns probability weights to atoms (elements of the referential), in this approach we may assign such weights to any subset, called a focal set, with the understanding that portions of this weight may move freely from one element to another within a focal set. We typically find this kind of knowledge when some measurement device is tainted with limited perception capabilities and a random error due to the variability of a phenomenon. We may obtain a sample of random intervals of the form ([m_i − ε, m_i + ε])_{i=1...K} supposedly containing the true value, where ε is a perception threshold, m_i is a measured value and K is the number of interval observations. Each interval is attached a probability ν_i of observing the measured value m_i. That is, we obtain a mass distribution (ν_i)_{i=1...K} on intervals, thus defining a random interval. The probability mass ν_i can be freely re-allocated to points within the interval [m_i − ε, m_i + ε]. However, there is not enough information to do so.</p><p>Like possibility theory, this theory provides two indicators, called plausibility Pl and belief Bel by Shafer (1976). They qualify the validity of a proposition stating that the value of variable X should lie within a set A (a certain interval, for example). The plausibility Pl and belief Bel measures are defined from the mass distribution ν assigning positive weights to a finite set F of measurable subsets of Ω:</p><p>ν : F → [0, 1] such that Σ_{E∈F} ν(E) = 1, (5)</p><p>as follows:</p><p>Bel(A) = Σ_{E, E⊆A} ν(E) (6)</p><p>and</p><p>Pl(A) = Σ_{E, E∩A≠∅} ν(E) = 1 − Bel(Aᶜ), (7)</p><p>where E ∈ F is called a focal element. 
Bel(A) gathers the imprecise evidence that asserts A; Pl(A) gathers the imprecise evidence that does not contradict A.</p><p>A mass distribution ν may encode the probability family 𝒫(ν) = {P, ∀A measurable, Bel(A) ≤ P(A)} = {P, ∀A measurable, P(A) ≤ Pl(A)} (Dempster, 1967). In this case we have P̄ = Pl and P̲ = Bel, so that</p><p>𝒫 ⊆ 𝒫(ν), …</p></li></ul>
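The belief and plausibility functions of Eqs. (6) and (7) reduce to simple sums over focal elements. A short sketch for the random-interval case (the focal intervals and masses below are invented measurement data, not taken from the paper):

```python
# Illustrative sketch (focal intervals and masses invented): belief and
# plausibility of an interval A under a mass distribution on focal
# intervals, following Eqs. (6) and (7).

def bel(focal, A):
    """Bel(A): total mass of the focal intervals entirely included in A."""
    lo, hi = A
    return sum(m for (e_lo, e_hi), m in focal if lo <= e_lo and e_hi <= hi)

def pl(focal, A):
    """Pl(A): total mass of the focal intervals intersecting A."""
    lo, hi = A
    return sum(m for (e_lo, e_hi), m in focal if e_hi >= lo and e_lo <= hi)

# Random intervals with masses summing to 1, e.g. imprecise measurements.
focal = [((1.0, 3.0), 0.5), ((2.0, 4.0), 0.25), ((5.0, 7.0), 0.25)]

A = (0.0, 4.5)
print(bel(focal, A))  # 0.75: the first two focal intervals lie inside A
print(pl(focal, A))   # 0.75: the third one does not even intersect A
```

Here Bel(A) and Pl(A) coincide because every focal interval is either included in A or disjoint from it; a focal interval straddling the boundary of A would count towards Pl(A) but not Bel(A), and the gap Pl(A) − Bel(A) then reflects the imprecision of the evidence.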