Overview

A query execution plan:

  • Consists of a collection of RelOps.
  • Executing together to produce a set of result tuples.

Results may be passed from one operator to the next:

  • Materialization: writing results to disk and reading them back.
  • Pipelining: generating and passing via memory buffers.

Materialisation

Steps in materialisation between two operators:

  • First operator reads input(s) and writes results to disk.
  • Next operator treats tuples on disk as its input.
  • In essence, the temporary tables are produced as real tables.

Advantage:

  • Intermediate results can be placed in a file structure (which can be chosen to speed up execution of subsequent operators).

Disadvantage:

  • Requires disk space and read/write for intermediate results.

Pipelining

How pipelining is organised between two operators:

  • Operators execute “concurrently” as producer/consumer pairs.
  • Structured as interacting iterators (open; while(next); close).

Advantage:

  • No requirement for disk access (results passed via memory buffers).

Disadvantages:

  • Higher-level operators access inputs via linear scan, or
  • Requires sufficient memory buffers to hold all outputs.

Pipelining Example

select s.id, e.course, e.mark
from   Student s, Enrolment e
where  e.student = s.id and
       e.semester = '05s2' and s.name = 'John';

Evaluated via communication between RA tree nodes:

Disk Accesses

Pipelining cannot avoid all disk accesses.

Some operations use multiple passes (e.g. merge sort, hash-join).

  • Data is written by one pass, read by subsequent passes.

Thus,

  • Within an operation, disk reads/writes are possible.
  • Between operations, no disk reads/writes are needed.

Example PostgreSQL Execution

Note: both left and right child nodes can execute concurrently before the result is combined.