Query Translation
Overview
Converts SQL statement text to RA expression.
Processes:
- Lexer/parser.
- Mapping rules.
- Rewriting rules.
Mapping from SQL to RA may include some optimisations.
Parsing SQL
Parsing task is similar to that for programming languages.
Expression Rewriting Rules
Since RA is a well-defined formal system:
- There exist many algebraic laws on RA expressions.
- Which can be used as a basis for expression rewriting.
- In order to produce equivalent (potentially more efficient) expressions.
Expression transformation based on such rules can be used:
- To simplify/improve SQL to RA mapping results.
- To generate new plan variations to check in query optimisation.
Relational Algebra Laws
Commutative and Associative Laws:
R ⨝ S = S ⨝ R, (R ⨝ S) ⨝ T = R ⨝ (S ⨝ T)
(natural join).R ∪ S = S ∪ R, (R ∪ S) ∪ T = R ∪ (S ∪ T)
(normal set algebra rules).R ⨝_Cond S = S ⨝_Cond R
(theta join).σ_c(σ_d(R)) = σ_d(σ_c(R))
.
Selection splitting:
σ_c∧d(R) = σ_c(σ_d(R))
.σ_c∨d(R) = σ_c(R) ∪ σ_d(R)
.
Selection pushing:
σ_c(R ∪ S) = σ_c(R) ∪ σ_c(S)
.σ_c(R ∩ S) = σ_c(R) ∩ σ_c(S)
.
Selection pushing with join:
σ_c(R ⨝ S) = σ_c(R) ⨝ S
(ifc
refers only to attributes fromR
).σ_c(R ⨝ S) = R ⨝ σ_c(S)
(ifc
refers only to attributes fromS
).- If condition contains attributes from both
R
andS
:σ_c′∧c″ (R ⨝ S) = σ_c′(R) ⨝ σ_c″(S)
.- Where
c'
contains onlyR
attributes andc''
contains onlyS
attributes.
Projection:
- All but last project can be ignored:
π_L1(π_L2(...π_Ln (R))) ==> π_L1(R)
.
- Projections can be pushed into joins:
π_L(R ⨝c S) = π_L(π_M(R) ⨝c π_N(S))
- Where
M
andN
must contain all attributes needed forc
andM
andN
must contain all attributes used inL
(L ⊆ M ∪ N
).
Query Rewriting
Subqueries are converted to joins when possible.
PostgreSQL Views
In PostgreSQL, views are implemented via rewrite rules.
- A reference to view in SQL expands its definition in RA.
For example:
create view COMP9315studes as
select stu,mark from Enrolments where course='COMP9315';
-- students who passed
select stu from COMP9315studes where mark >= 50;
is represented as:
COMP9315studes
= Proj[stu,mark](Sel[course=COMP9315](Enrolments))
-- with query ...
Proj[stu](Sel[mark>=50](COMP9315studes))
-- becomes ...
Proj[stu](Sel[mark>=50](
Proj[stu,mark](Sel[course=COMP9315](Enrolments)))
)
-- which could be rewritten as ...
Proj[stu](Sel[mark>=50 & course=COMP9315]Enrolments)