Query Execution

Overview

A query execution plan:

Consists of a collection of RelOps.
Executing together to produce a set of result tuples.

Results may be passed from one operator to the next:

Materialization: writing results to disk and reading them back.
Pipelining: generating and passing via memory buffers.

Materialisation

Steps in materialisation between two operators:

First operator reads input(s) and writes results to disk.
Next operator treats tuples on disk as its input.
In essence, the temporary tables are produced as real tables.

Advantage:

Intermediate results can be placed in a file structure (which can be chosen to speed up execution of subsequent operators).

Disadvantage:

Requires disk space and read/write for intermediate results.

Pipelining

How pipelining is organised between two operators:

Operators execute “concurrently” as producer/consumer pairs.
Structured as interacting iterators (open; while(next); close).

Advantage:

No requirement for disk access (results passed via memory buffers).

Disadvantages:

Higher-level operators access inputs via linear scan, or
Requires sufficient memory buffers to hold all outputs.

Pipelining Example

select s.id, e.course, e.mark
from   Student s, Enrolment e
where  e.student = s.id and
       e.semester = '05s2' and s.name = 'John';

Evaluated via communication between RA tree nodes:

Disk Accesses

Pipelining cannot avoid all disk accesses.

Some operations use multiple passes (e.g. merge sort, hash-join).

Data is written by one pass, read by subsequent passes.

Thus,

Within an operation, disk reads/writes are possible.
Between operations, no disk reads/writes are needed.

Example PostgreSQL Execution

Note: both left and right child nodes can execute concurrently before the result is combined.