55
ADT 2010 ADT 2010 MonetDB/XQuery (2/2): MonetDB/XQuery (2/2): High-Performance, Purely Relational High-Performance, Purely Relational XQuery Processing XQuery Processing http://pathfinder-xquery.org/ http://pathfinder-xquery.org/ http://monetdb-xquery.org/ http://monetdb-xquery.org/ Stefan Manegold [email protected] http://www.cwi.nl/~manegold/

MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 [email protected] MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

ADT 2010ADT 2010

MonetDB/XQuery (2/2):MonetDB/XQuery (2/2):

High-Performance, Purely Relational High-Performance, Purely Relational

XQuery ProcessingXQuery Processing

http://pathfinder-xquery.org/http://pathfinder-xquery.org/

http://monetdb-xquery.org/http://monetdb-xquery.org/

Stefan [email protected]

http://www.cwi.nl/~manegold/

Page 2: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

2

[email protected] MonetDB/XQuery ADT 2010

XPath evaluation (SQL)XPath evaluation (SQL)

Example query: /descendant::open_auction[./bidder]/annotation

SELECT DISTINCT a.pre FROM doc r, doc oa, doc b, doc a WHERE r.pre=0 AND oa.pre > r.pre AND oa.post < r.post AND oa.name = “open_auction” AND oa.kind = “elem” AND b.pre > oa.pre AND b.post < oa.post AND b.level = oa.level + 1 AND b.name = “bidder” AND b.kind < “elem” AND a.pre > oa.pre AND a.post < oa.post AND a.level = oa.level + 1 AND a.name = “annotation” AND a.kind = “elem” ORDER BY a.pre

• (potentially?) expensive joins due to range predicates• (potentially?) expensive duplicate elimination

<- descendant

} child

} child

Page 3: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

3

[email protected] MonetDB/XQuery ADT 2010

Staircase Join Staircase Join [VLDB03][VLDB03]

document

List of context nodes

seek

pre|post are not random numbers: =>exploit the tree properties encoded in them

Page 4: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

4

[email protected] MonetDB/XQuery ADT 2010

document

List of context nodes

scan

Staircase Join Staircase Join [VLDB03][VLDB03]

pre|post are not random numbers: =>exploit the tree properties encoded in them

Page 5: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

5

[email protected] MonetDB/XQuery ADT 2010

document

List of context nodes

skip

Staircase Join Staircase Join [VLDB03][VLDB03]

pre|post are not random numbers: =>exploit the tree properties encoded in them

Page 6: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

6

[email protected] MonetDB/XQuery ADT 2010

document

List of context nodes

seek

seek scan skip seek scan skip ...

Staircase Join Staircase Join [VLDB03][VLDB03]

pre|post are not random numbers: =>exploit the tree properties encoded in them

Page 7: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

7

[email protected] MonetDB/XQuery ADT 2010

• skipping: avoid touching node ranges that cannot contain results

Generate a duplicate-free result in document order • pruning: reduce the context set a-priori• partitioning: single sequential pass over the document

document

List of context nodes

seek

seek scan skip seek scan skip ...

Staircase Join Staircase Join [VLDB03][VLDB03]

Page 8: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

8

[email protected] MonetDB/XQuery ADT 2010

Example:

(c1,c2)/descendant:*

SELECT DISTINCT doc.pre FROM c, doc WHERE doc.pre > c.pre AND doc.post < c.post

Avoid comparing large chunks of the document table.

Staircase Join: SkippingStaircase Join: Skipping

Page 9: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

9

[email protected] MonetDB/XQuery ADT 2010

Example:

(c1,c2,c3,c4)/ancestor:*

SELECT DISTINCT doc.pre FROM c, doc WHERE doc.pre < c.pre AND doc.post > c.post

Eliminate: c1, c3

Keep: c2, c4

Staircase Join: PruningStaircase Join: Pruning

Page 10: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

10

[email protected] MonetDB/XQuery ADT 2010

Example:

(c1,c2,c3)/ancestor:*

SELECT DISTINCT doc.preFROM c, doc WHERE doc.pre < c.pre AND doc.post > c.post

Single-pass algorithm that avoids generating duplicates

Staircase Join: PartitioningStaircase Join: Partitioning

Page 11: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

11

[email protected] MonetDB/XQuery ADT 2010

Example:

(c1,c2,c3)/ancestor:*

SELECT DISTINCT doc.preFROM c, doc WHERE doc.pre < c.pre AND doc.post > c.post

Single-pass algorithm that avoids generating duplicates

Staircase Join: PartitioningStaircase Join: Partitioning

Page 12: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

12

[email protected] MonetDB/XQuery ADT 2010

ScheduleSchedule•So far:

•RDBMS back-end support for XML/XQuery (1/2):

•Document Representation (XPath Accelerator, Pre/Post

plane)

•XPath navigation (Staircase Join)

Page 13: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

13

[email protected] MonetDB/XQuery ADT 2010

ScheduleSchedule•So far

•RDBMS back-end support for XML/XQuery (1/2):

•Document Representation (XPath Accelerator, Pre/Post

plane)

•XPath navigation (Staircase Join)

•Now:

•XQuery to Relational Algebra Compiler:

•Item- & Sequence- Representation

•Efficient FLWoR Evaluation (Loop-Lifting)

•Optimization

•RDBMS back-end support for XML/XQuery (2/2):

•Updateable Document Representation

Page 14: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

14

[email protected] MonetDB/XQuery ADT 2010

Source Language: XQuery CoreSource Language: XQuery Core

Page 15: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

15

[email protected] MonetDB/XQuery ADT 2010

Target Language: Relational AlgebraTarget Language: Relational Algebra

Page 16: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

16

[email protected] MonetDB/XQuery ADT 2010

• sequence = table of items• add pos column for maintaining order• ignore polymorphism for the moment

Sequence RepresentationSequence Representation

Page 17: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

17

[email protected] MonetDB/XQuery ADT 2010

■ Item sequences, sequence order

(10, “x”, <a/>, 10) →

Node Atom Item1 NULL 10 1 10

2 NULL “x” → 2 “X”3 NULL 34 NULL 10 4 10

Pos Pos

pre(a) pre(a)

Item

■ Problems:

● Polymorphic columns

● Redundant storage

● Copy overhead (especially with strings)

Sequence RepresentationSequence Representation

Page 18: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

18

[email protected] MonetDB/XQuery ADT 2010

■ Item sequences, sequence order

(10, “x”, <a/>, 10) →

Node Atom Item1 NULL 10 1 10

2 NULL “x” → 2 “X”3 NULL 34 NULL 10 4 10

Pos Pos

pre(a) pre(a)

Item

Pos Item Pos Item Kind str_values1 10 1 Int “x”2 “X” → 2 Str3 pre(a) 3 Node int_values4 10 4 Int 10

1@0 1@01@00@01@0 1@0

Item RepresentationItem Representation

Page 19: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

19

[email protected] MonetDB/XQuery ADT 2010

IterationsIterations

Page 20: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

20

[email protected] MonetDB/XQuery ADT 2010

IterationsIterations

Page 21: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

21

[email protected] MonetDB/XQuery ADT 2010

IterationsIterations

Page 22: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

22

[email protected] MonetDB/XQuery ADT 2010

IterationsIterations

11..1

Page 23: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

23

[email protected] MonetDB/XQuery ADT 2010

IterationsIterations

Page 24: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

24

[email protected] MonetDB/XQuery ADT 2010

Loop-LiftingLoop-Lifting

Page 25: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

25

[email protected] MonetDB/XQuery ADT 2010

Loop-LiftingLoop-Lifting

Page 26: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

26

[email protected] MonetDB/XQuery ADT 2010

Loop-LiftingLoop-Lifting

Page 27: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

27

[email protected] MonetDB/XQuery ADT 2010

Loop-LiftingLoop-Lifting

Page 28: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

28

[email protected] MonetDB/XQuery ADT 2010

Deriving Deriving looploop

Page 29: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

29

[email protected] MonetDB/XQuery ADT 2010

Nested ScopesNested Scopes

Page 30: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

30

[email protected] MonetDB/XQuery ADT 2010

Nested ScopesNested Scopes

Page 31: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

31

[email protected] MonetDB/XQuery ADT 2010

Loop-liftingLoop-lifting

Page 32: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

32

[email protected] MonetDB/XQuery ADT 2010

Full ExampleFull Example

join calc project

Page 33: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

33

[email protected] MonetDB/XQuery ADT 2010

XQuery On SQL HostsXQuery On SQL Hosts [VLDB04][VLDB04]

XQuery Construct

sequence construction

if-then-else

for-loops

calculations

list functions, e.g. fn:first()

element construction

XPath steps

Relational Mapping

A union B

select(A=true,B) union select(A=false,C)

cartesian product

project(A,x=expr(Y1,..Yn))

select(A,pos=1)

updates in temporary tables

staircase-join

Page 34: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

34

[email protected] MonetDB/XQuery ADT 2010

XMark Query 8XMark Query 8let $auction := doc("auctions.xml")return for $p in $auction/site/people/person let $a := for $t in $auction/site/ closed_auctions/closed_auction where $t/buyer/@person = $p/@id return $treturn <item person="{$p/name/text()}"> {count($a)} </item>

Page 35: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

35

[email protected] MonetDB/XQuery ADT 2010

Peephole OptimizationPeephole Optimization[Grust, XIME-P 2005][Grust, XIME-P 2005]

Page 36: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

36

[email protected] MonetDB/XQuery ADT 2010

Peephole OptimizationPeephole OptimizationPeephole OptimizationPeephole Optimization[Grust, XIME-P 2005][Grust, XIME-P 2005]

Page 37: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

37

[email protected] MonetDB/XQuery ADT 2010

Peephole OptimizationPeephole OptimizationPeephole OptimizationPeephole Optimization[Grust, XIME-P 2005][Grust, XIME-P 2005]

Page 38: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

38

[email protected] MonetDB/XQuery ADT 2010

Peephole OptimizationPeephole OptimizationPeephole OptimizationPeephole Optimization[Grust, XIME-P 2005][Grust, XIME-P 2005]

Page 39: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

39

[email protected] MonetDB/XQuery ADT 2010

Peephole OptimizationPeephole OptimizationPeephole OptimizationPeephole Optimization[Grust, XIME-P 2005][Grust, XIME-P 2005]

Page 40: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

40

[email protected] MonetDB/XQuery ADT 2010

Peephole OptimizationPeephole OptimizationPeephole OptimizationPeephole Optimization[Grust, XIME-P 2005][Grust, XIME-P 2005]

Page 41: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

41

[email protected] MonetDB/XQuery ADT 2010

• Plan simplification

Peephole OptimizationPeephole OptimizationPeephole OptimizationPeephole Optimization[Grust, XIME-P 2005][Grust, XIME-P 2005]

Page 42: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

42

[email protected] MonetDB/XQuery ADT 2010

• Sort Avoidance

generated by DENSE RANK pos ORDER BY pos PARTITION BY iter

by tracking secondary ordering properties [pos|iter], [Wang&Cherniack, VLDB’03]

C43B23A13Z102Y51X41

itempositer

[iter,pos]

Peephole OptimizationPeephole Optimization[Boncz et al., SIGMOD 2006][Boncz et al., SIGMOD 2006]

Page 43: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

43

[email protected] MonetDB/XQuery ADT 2010

• Sort Avoidance

generated by DENSE RANK pos ORDER BY pos PARTITION BY iter

by tracking secondary ordering properties [pos|iter], [Wang&Cherniack, VLDB’03]

C43B23A13Z102Y51X41

itempositer

C33B23A13Z12Y21X11

itempositer

[iter,pos]

Peephole OptimizationPeephole OptimizationPeephole OptimizationPeephole Optimization[Boncz et al., SIGMOD 2006][Boncz et al., SIGMOD 2006]

Page 44: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

44

[email protected] MonetDB/XQuery ADT 2010

• Sort Avoidance

generated by DENSE RANK pos ORDER BY pos PARTITION BY iter

by tracking secondary ordering properties [pos|iter], [Wang&Cherniack, VLDB’03]

hash-based (streaming) DENSE_RANK

C43B23Y51A13Z102X41

itempositer

C33B23Y21A13Z12X11

itempositer

[pos|iter]

Peephole OptimizationPeephole OptimizationPeephole OptimizationPeephole Optimization[Boncz et al., SIGMOD 2006][Boncz et al., SIGMOD 2006]

Page 45: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

45

[email protected] MonetDB/XQuery ADT 2010

• Sort Avoidance

generated by DENSE RANK pos ORDER BY pos PARTITION BY iter

by tracking secondary ordering properties [pos|iter], [Wang&Cherniack, VLDB’03]

hash-based (streaming) DENSE_RANK

Peephole OptimizationPeephole OptimizationPeephole OptimizationPeephole Optimization[Boncz et al., SIGMOD 2006][Boncz et al., SIGMOD 2006]

Page 46: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

46

[email protected] MonetDB/XQuery ADT 2010

• Sort Avoidance

generated by DENSE RANK pos ORDER BY pos PARTITION BY iter

by tracking secondary ordering properties [pos|iter], [Wang&Cherniack, VLDB’03]

hash-based (streaming) DENSE_RANK

• Sort Reduction

if [iter,item] order is required, and [iter] only is present

use a refine-sort rather than a full sort.

pipelinable sort

Peephole OptimizationPeephole OptimizationPeephole OptimizationPeephole Optimization[Boncz et al., SIGMOD 2006][Boncz et al., SIGMOD 2006]

Page 47: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

47

[email protected] MonetDB/XQuery ADT 2010

• Sort Avoidance

generated by DENSE RANK pos ORDER BY pos PARTITION BY iter

by tracking secondary ordering properties [item|iter], [Wang&Cherniack, VLDB’03]

hash-based (streaming) DENSE_RANK

• Sort Reduction

if [iter,item] order is required, and [iter] only is present

use a refine-sort rather than a full sort.

pipelinable sort

• Join Detection

detect cartesian products as multi-valued dependencies

in the precense of a theta-selection theta-join

Peephole OptimizationPeephole OptimizationPeephole OptimizationPeephole Optimization[Boncz et al., SIGMOD 2006][Boncz et al., SIGMOD 2006]

Page 48: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

48

[email protected] MonetDB/XQuery ADT 2010

Loop-lifted XPath StepsLoop-lifted XPath Steps

Many algorithms have been proposed & studied for XPath evaluation:• Dataguide based, • Structural Join,• Staircase Join, • Holistic Twig Join

IN: sequence of context nodes in (doc order)OUT: sequence of document nodes (unique, in doc order)

Page 49: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

49

[email protected] MonetDB/XQuery ADT 2010

Loop-lifted XPath StepsLoop-lifted XPath Steps

In XQuery, expressions generally occur inside FLWR blocks, i.e. inside a for-loop

for $x in doc()//employee $x/ancestor::department

Choice:• call XPath algorithm N times, accessing document and index structures N times.• use a loop-lifted algorithm:

IN: for each iteration, a sequence of context nodesOUT: for each iteration, a sequence of document nodes (per iteration unique, in doc order)

Page 50: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

50

[email protected] MonetDB/XQuery ADT 2010

Staircase joinStaircase join

document

List of context nodes

Page 51: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

51

[email protected] MonetDB/XQuery ADT 2010

Loop-lifted staircase joinLoop-lifted staircase join

document document

List of context nodes Active stack

Multiple lists of context nodes

Adapt:

pruning, partitioning and skipping rules

to correctly deal with multiple context sets

Page 52: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

52

[email protected] MonetDB/XQuery ADT 2010

Loop-lifted staircase joinLoop-lifted staircase join

Results on the 20 XMark queries:

Page 53: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

53

[email protected] MonetDB/XQuery ADT 2010

Performance EvaluationPerformance Evaluation

Extensive performance Evaluation on XMark• data sizes 110KB, 1MB, 11MB, 110MB, 1.1GB, 11GB• MonetDB/XQuery, Galax0.6, X-Hive 6.0, Berkeley DB XML 2.0, eXisT• 8GB RAM

Extensive XMark performance Literature Overview• IPSI-XQ v1.1.1b , Dynamic Interval Encoding , Kweelt , QuiP, FluX , TurboXPath, Timber, Qizx/Open (Version 0.4/p1), Saxon (Version 8.0), BEA/XQRL, VX• Crude comparison (normalized by CPU SPECint)

Page 54: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

54

[email protected] MonetDB/XQuery ADT 2010

XMark BenchmarkXMark Benchmark1MB XML 1GB XML

Galax

X-Hive

BerkeleyDB XML

eXisT

Page 55: MonetDB/XQuery (2/2): High-Performance, Purely Relational ... · 13 Stefan.Manegold@CWI.nl MonetDB/XQuery ADT 2010 Schedule •So far •RDBMS back-end support for XML/XQuery (1/2):

55

[email protected] MonetDB/XQuery ADT 2010

• http://monetdb.cwi.nl/XQuery/Benchmark/

• S. Manegold: “An Empirical Evaluation of XQuery Processors”, Information Systems, 33:203-220, April 2008. http://repository.cwi.nl/search/fullrecord.php?publnr=13811

More Benchmarks & Performance Results?More Benchmarks & Performance Results?

• http://monetdb.cwi.nl/XQuery/Benchmark/

• S. Manegold: “An Empirical Evaluation of XQuery Processors”, Information Systems, 33:203-220, April 2008. http://www.cwi.nl/htbin/ins1/publications?request=abstract&key=Ma:IS:08

• http://monetdb.cwi.nl/XQuery/Benchmark/

• S. Manegold: “An Empirical Evaluation of XQuery Processors”, Information Systems, 33:203-220, April 2008. http://www.cwi.nl/htbin/ins1/publications?request=abstract&key=Ma:IS:08