
A survey of multiple precision computation using floating-point arithmetic

Fourth International Workshop on Taylor Methods

Christoph Quirin Lauter

Laboratoire de l'Informatique et du Parallélisme
École Normale Supérieure de Lyon

Boca Raton, December 16-19, 2006


Outline

Motivation
Exact floating-point arithmetic
Double-double, triple-double and expansion arithmetic


Project by Arénaire at ENS de Lyon

crlibm [1]: a correctly rounded elementary function library

Elementary functions as in a usual libm: exp, sin, cos, ...

Evaluating elementary functions means evaluating polynomials

Correct rounding requires high accuracy and complete proofs

[1] http://lipforge.ens-lyon.fr/www/crlibm/


Need for more precision

IEEE 754 double precision offers 53 bits of precision

In crlibm, we need an accuracy of "120 correct bits"

In Taylor models, there is no use for high order polynomials if the remainder grows too fast

First approach:
  Use an integer based fixed high precision floating-point library
  Necessity to leave the floating-point pipeline
  High impact on performance (factor 100)

Second approach:
  Emulate higher precision in floating-point
  Reuse of already computed floating-point values possible
  No conversions, the floating-point pipeline stays full
  Speed-up by at least a factor 10 w.r.t. the first approach
  Same quality of certification possible


Higher precision in floating-point

Floating-point expansions: a high precision significand is represented as an unevaluated sum of floating-point numbers.

[Figure: three 53-bit doubles with non-overlapping values together encode a 159-bit significand.]

Operations on expansions, for example addition:

[Figure: the components ah, am, al and bh, bm, bl are summed into ch, cm, cl, up to an error δ.]


Need for exact floating-point arithmetic

We want to implement the expansion addition above: (ah + am + al) + (bh + bm + bl) = ch + cm + cl, up to an error δ.

Single step: add ah and bh exactly into two doubles,

temp1h + temp1l = ah + bh

then continue with am, ...


Exact floating-point arithmetic


Exact floating-point arithmetic?

Floating-point arithmetic can produce round-off error:

a ⊗ b = a · b · (1 + ε), where |ε| ≤ 2^−p

A floating-point operation is called exact if its result is the mathematical one:

a ⊗ b = a · b, i.e. ε = 0

However, floating-point arithmetic is often exact:
  Floating-point numbers are scaled integers
  If no integer overflow occurs, operations are exact on integers
  Just factorize the scale (where possible):

a ⊗ b = 2^Ea · ma ⊗ 2^Eb · mb = 2^(Ea+Eb) · ◦(ma · mb)

where ◦ is the rounding operator, satisfying ∀x ∈ F, ◦(x) = x


Disclaimer

If tomorrow you want to implement what I am going to show on the next slides, remember that...

Code here is in C, and Fortran behaves differently:
  Implicit parentheses are placed elsewhere, but our exact FP arithmetic requires the indicated operation order
  Typing of mixed precision expressions is different
  "Optimizations" the compiler is allowed to do are different

Declaring variables as double x,y,z; does not imply usage of IEEE 754 double precision on most systems

The round-to-nearest rounding mode is required by some exact arithmetic sequences, in particular for exact multiplication

Special care is needed for subnormals, underflow and overflow


Sterbenz' lemma

Let a, b ∈ F be such that

sgn(a) = sgn(b) and (1/2) · |a| ≤ |b| ≤ 2 · |a|

Then the floating-point subtraction is exact:

a ⊖ b = a − b

Proof intuition: factor out the common scale 2^E of the scaled integers that a and b are:

a = 2^E · ma,  b = 2^E · mb,  a ⊖ b = 2^E · (ma − mb)

At the base of most extended precision addition algorithms

Independent of the rounding mode


Fast2Sum

Let round-to-nearest be the current rounding mode in IEEE 754.
Let a, b ∈ F such that |a| ≥ |b|.
Let s, r ∈ F be computed by

s = a + b;
t = s - a;
r = b - t;

Then

s + r = a + b and |r| ≤ ulp(s)

Proof intuition: apply Sterbenz' lemma

Meaning of s and r: s is the approximate sum, r the absolute error


2Sum

Let round-to-nearest be the current rounding mode in IEEE 754.
Let a, b ∈ F.
Let s, r ∈ F be computed by

s = a + b;
if (fabs(a) >= fabs(b)) {
  t = s - a;
  r = b - t;
} else {
  t = s - b;
  r = a - t;
}

Then

s + r = a + b and |r| ≤ ulp(s)


Branches ?

There are branches!

Branches are expensive on current pipelined processors!


2Sum - avoiding branches

Let round-to-nearest be the current rounding mode in IEEE 754.
Let a, b ∈ F.
Let s, r ∈ F be computed by

s = a + b;
t1 = s - a;
t2 = s - b;
d1 = b - t1;
d2 = a - t2;
r = d1 + d2;

Then

s + r = a + b and |r| ≤ ulp(s)

⇒ Up to 10% performance gain w.r.t. the branching version!


Round-to-nearest?

Is the round-to-nearest mode required?

I am doing interval arithmetic and I do not like to change the rounding mode!


Fast2Sum - any rounding mode

Let a, b ∈ F such that |a| ≥ |b|.
Let s, r ∈ F be computed by

s = a + b;
e = s - a;
g = s - e;
h = g - a;
f = b - h;
r = f - e;
if (r + e != f) {
  s = a;
  r = b;
}

Then

s + r = a + b and |r| ≤ ulp(s)


Multiplication - Introduction

Addition: s + r = a + b

Multiplication, similarly: s + r = a · b. Is this possible?

The significand of a · b fits into a sum of two FP numbers, s + r

How do we compute s and r?


Multiplication - The easy way

Suppose that the system supports a fused multiply-and-add (FMA) operation: FMA(a, b, c) = ◦(a · b + c).
Let a, b ∈ F.
Let s, r ∈ F be computed by

s = a * b;
r = FMA(a, b, -s);   /* r = ◦(a · b − s) */

Then

s + r = a · b and |r| ≤ ulp(s)


Multiplication - Graphical "proof"

[Figure: the exact 2p-bit product a · b is rounded to s = ◦(a · b); subtracting s from a · b cancels the leading bits, leaving a representable remainder r.]


Multiplication - without FMA

Let a, b ∈ Fp, on p bits

We want s + r = a · b

Let ah + al = a and bh + bl = b

Clearly a · b = ah · bh + ah · bl + al · bh + al · bl

If ah, al, bh, bl are written on at most p′ bits, all products fit into 2 · p′ bits

If 2 · p′ ≤ p, we can write:

a · b = ah ⊗ bh + ah ⊗ bl + al ⊗ bh + al ⊗ bl

Since a · b fits into at most 2 · p bits, there will be sufficient cancellation in the summation of the products producing s + r
⇒ use here the exact 2Sum presented before

How can we compute ah + al = a?


Cut into halves

Let round-to-nearest be the current rounding mode in IEEE 754.
Let a ∈ Fp with precision p.
Let h, l ∈ Fp be computed by

c = 2^(p-k) + 1;
y = c * a;
z = y - a;
h = y - z;
l = a - h;

[Figure: y = ◦((2^(p−k) + 1) · a) superposes a shifted copy of a onto a itself; z = ◦(y − a) recovers the shifted copy, so h = y − z keeps the leading bits of a and l = a − h the trailing ones.]

Then

h + l = a

and h has at least k trailing zeros and l has at least p − k + 1 trailing zeros.


Other exact operations

Division and square root:

One can only express the backward error

s = f(a − δ)

instead of s + δ = f(a), as for addition and multiplication.

Division:

d = (a − r) / b, where d = a ⊘ b ∈ F and r ∈ F

Square root:

s = √(a − r), where s = ◦(√a) and r ∈ F

We can implement division and square root on expansions even with backward errors


Double-double, triple-double and expansion arithmetic


Vocabulary

Represent high precision numbers as unevaluated sums of floating-point numbers:

x = x_1 + x_2 + ... + x_n

Suppose the native precision to be IEEE 754 double precision:

n = 2: "double-double" - ≈ 102 bits of accuracy
n = 3: "triple-double" - ≈ 150 bits of accuracy
n = 4: Bailey: "quad-double"
any n: expansions


Operations on expansions

Operations on expansions:
  Addition - use the 2Sum algorithm for the carries
  Multiplication - partial products using 2Mult, summed up using 2Sum
  Division - Euclid's division using an exact backward error sequence, or Newton's method
  Square root - Newton's method
  Renormalization - use 2Sums and tests to bring expansions to a non-overlapping form

Cost:
  No conversions between floating-point and integer ⇒ double-double and triple-double are much faster
  Expansions are inefficient: the exponents are redundant information
  Floating-point arithmetic has some bizarre behaviours ⇒ general expansions seem to be more expensive than integer based methods because of a high number of tests


Double-double and triple-double in crlibm

Full implementation of double-double:
  Versions of 2Sum and 2Mult optimized for different processors (FMA, FABS, ...)
  All combinations: double + double, double-double + double, etc.
  Accuracy proof for each operator; the proofs can already be formally verified (Gappa)

Almost complete implementation of triple-double:
  Based on double-double
  Almost all combinations of double, double-double or triple-double in operands or result
  Accuracy proof for each operator
  Approach for avoiding renormalizations whilst remaining rigorous
  No branches on common machines
  Correct (IEEE 754) rounding to double implemented

Automatic routines for generating double, double-double and triple-double code for evaluating complete polynomials in Horner's scheme, with formal proof generation


Speed-ups

Logarithm - evaluate polynomials of degree about 12-20

  Library                                          cycles
  MPFR - integer based multiprecision               12942
  crlibm portable, integer based multiprecision      2748
  crlibm portable, triple-double                      266

Exponential - evaluate polynomials of degree about 7-15

  Library                                          cycles
  MPFR - integer based multiprecision                4908
  crlibm portable, integer based multiprecision      1976
  crlibm portable, triple-double                      258


Conclusion

Presentation of exact floating-point arithmetic

Overview of general techniques for expansions

Double-double and triple-double are quite efficient:
  No branches needed
  No conversions needed
  Speed-up of a factor of about 10

Rigorous proofs are possible (Gappa)

General expansion algorithms are known but rarely implemented
