relational/complexity

97 lines
3.1 KiB
Plaintext

Complexity
Abstract
Purpose of this document is to describe in a detailed way the
complexity of relational algebra operations. The evaluation will be
done on the specific implementation of this program, not on theorical
lower limits.
Latest implementation can be found at:
http://galileo.dmi.unict.it/svn/trunk
Notation
Big O notation will be used. Constant values will be ignored.
Single letters will be used to indicate relations and letters between
| will indicate the cardinality (number of tuples) of the relation.
Then after evaluating the big O notation, an attempt to find more
precise results will be done, since it will be important to know
with a certain precision the weight of the operation.
1. UNARY OPERATORS
Relational defines three unary operations, and they will be studied
in this section. It doesn't mean that they should have similar
complexity.
1.1 Selection
Selection works on a relation and on a python expression. For each
tuple of the relation, it will create a dictionary with name:value
where name are names of the fields in the relation and value is the
value for the specific row.
We can consider the inner cycle as constant as its value doesn't
depend on the relation itself but only on the kind of the relation
(how many field it has).
Then comes the evaluation. A python expression in truth could do
much more things than just checking if a>b. Anyway, ssuming that
nobody would ever write cycles into a selection condition, we have
another constant complexity for this operation.
Then, the tuple is inserted in a new relation if it satisfies the
condition. Since no check on duplicated tuples is performed, this
operation is constant too.
In the end we have O(|n|) as complexity for a selection on the
relation n.
The assumption made of considering constant the number of fields is
a bit strong. For example a relation could have hundreds of fields
and two tuples.
So in general, the complexity is something more like O(|n| * f) where
f is the number of the fields.
1.2 Rename
The rename operation itself is very simple, just modify the list
containing the name of the fields.
The big issue is to copy the content of the relation into a new
relation object, so the new one can be modified.
So the operation depends on the size of the relation: O(|n| * f).
1.3 Projection
The projection operation creates a copy of the original relation
using only a subset of its fields. Time for the copy is something
like O(|n|*f) where f is the number of fields to copy.
But that's not all. Since relations are set, duplicated items are not
allowed. So after extracting the wanted elements, it has to check if
the new tuple was already added to the new relation. And this brings
the complexity to O((|n|*f)²).
2. BINARY OPERATORS
Relational defines nine binary operations, and they will be studied
in this section.
2.1 Product
2.2 Intersection
2.3 Difference
2.4 Union
2.5 Thetajoin
2.6 Outher
2.7 Outher_left
2.8 Outher_right
2.9 Join