Complexity

Abstract
The purpose of this document is to describe in detail the complexity
of relational algebra operations. The evaluation is done on the
specific implementation in this program, not on theoretical lower
bounds.

The latest implementation can be found at:
https://github.com/ltworf/relational

Notation
Big O notation will be used. Constant factors will be ignored.

Single letters will be used to indicate relations, and a letter
between vertical bars, such as |n|, will indicate the cardinality
(number of tuples) of that relation.

The number of tuples alone is not enough. For example, a relation
with one tuple and thousands of fields will not, in general, take
O(1) to be evaluated. So we assume that relations have a reasonable
and comparable number of fields.

After the big O evaluation, an attempt will be made to find more
precise results, since it is important to know the weight of each
operation with some precision.

1. UNARY OPERATORS

Relational defines three unary operations, which are studied in this
section. Being grouped together does not mean that they have similar
complexity.

1.1 Selection

Selection works on a relation and on a Python expression. For each
tuple of the relation, it creates a dictionary of name:value pairs,
where each name is the name of a field in the relation and each value
is the value of that field in the specific row.
We can consider this inner cycle as constant, since its cost doesn't
depend on the content of the relation but only on the kind of
relation (how many fields it has).
Then comes the evaluation. A Python expression could in fact do much
more than just check whether a>b. However, assuming that nobody would
ever write cycles into a selection condition, this operation has
constant complexity as well.
Then the tuple is inserted into a new relation if it satisfies the
condition. Since no check for duplicate tuples is performed, this
operation is constant too.

In the end we have O(|n|) as the complexity of a selection on the
relation n.
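
The steps described for selection can be sketched as follows. This is
only an illustration under stated assumptions, not the code of the
program: the function name, the list-of-tuples representation and the
use of compile() are all hypothetical.

```python
def select(header, rows, condition):
    """Keep the rows for which the Python expression `condition` is true."""
    # Assumption: the expression is compiled once, outside the loop.
    code = compile(condition, "<selection>", "eval")
    result = []
    for row in rows:                   # |n| iterations
        env = dict(zip(header, row))   # name:value pairs, cost depends only on the field count
        if eval(code, {}, env):        # assumed constant: no cycles in the condition
            result.append(row)         # no duplicate check, so constant too
    return result


header = ["name", "age"]
rows = [("alice", 31), ("bob", 17), ("carol", 45)]
adults = select(header, rows, "age >= 18")
```

Each of the three steps inside the loop is constant for a fixed
header, so the loop itself is what gives O(|n|).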

1.2 Rename

The rename operation itself is very simple: it just modifies the list
containing the names of the fields.
The real cost is copying the content of the relation into a new
relation object, so that the new one can be modified.

So the operation depends on the size of the relation: O(|n|).
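
A minimal sketch of a rename, assuming a relation is held as a header
list plus a list of tuples (a hypothetical representation, as are the
function and parameter names):

```python
def rename(header, rows, mapping):
    """Return a renamed copy; `mapping` maps old field names to new ones."""
    # Cheap part: one pass over the field names.
    new_header = [mapping.get(name, name) for name in header]
    # Expensive part: copying the content, one step per tuple, O(|n|).
    new_rows = list(rows)
    return new_header, new_rows


header = ["id", "name"]
rows = [(1, "alice"), (2, "bob")]
new_header, new_rows = rename(header, rows, {"name": "full_name"})
```

The original header and rows are left untouched, which is the reason
the copy is needed in the first place.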

1.3 Projection

The projection operation creates a copy of the original relation
using only a subset of its fields. The time for the copy is something
like O(|n|*f), where f is the number of fields to copy.
But that's not all. Since relations are sets, duplicate items are not
allowed. So after extracting the wanted elements, it has to check
whether the new tuple was already added to the new relation. This
brings the complexity to O(|n|²).

But the projection can also be used to "rearrange" fields, which
makes no sense in pure relational algebra but can be useful to make
two relations match (in fact it is used internally to make relations
match if they have the same fields in a different order). In this
case there is no need to check whether the tuple already exists,
because the original relation is assumed to be correct. This gives a
complexity of O(|n|) in the best case.
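
The two cases of projection can be sketched like this. The names and
the check_duplicates flag are hypothetical, and a plain list is used
for the result on purpose, so that the membership test is a linear
scan, which is the cost this section describes.

```python
def project(header, rows, fields, check_duplicates=True):
    """Copy `rows`, keeping only `fields`, in the order given."""
    positions = [header.index(name) for name in fields]
    result = []
    for row in rows:                                    # |n| iterations
        new_row = tuple(row[i] for i in positions)      # f steps per tuple
        if check_duplicates and new_row in result:      # linear scan: up to |n| comparisons
            continue
        result.append(new_row)
    return result


header = ["name", "city", "age"]
rows = [("alice", "rome", 31), ("bob", "rome", 31), ("carol", "oslo", 45)]

# Dropping a field can create duplicates, so the check is needed.
subset = project(header, rows, ["city", "age"])

# Rearranging all the fields cannot create duplicates, so it is skipped.
rearranged = project(header, rows, ["age", "name", "city"], check_duplicates=False)
```

With the check enabled, the scan inside the loop is what turns
O(|n|) into O(|n|²).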

2. BINARY OPERATORS

Relational defines nine binary operations, which are studied in this
section. Since each operation deals with two relations, we will call
them m and n, and f and g will be the number of their fields.

2.1 Product

Product is a very expensive operation. It is O(|n|*|m|), since every
tuple of n is combined with every tuple of m.
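
As a minimal sketch (hypothetical names, relations assumed to be
lists of tuples), the product is just two nested loops:

```python
def product(n_rows, m_rows):
    """Pair every tuple of n with every tuple of m."""
    return [n_row + m_row for n_row in n_rows for m_row in m_rows]


n_rows = [(1, "a"), (2, "b")]
m_rows = [("x",), ("y",), ("z",)]
pairs = product(n_rows, m_rows)
```

The result always has |n|*|m| tuples, so no implementation can do
better than O(|n|*|m|) here.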

2.2 Intersection

Same as product, even though it does a different thing. It has to
compare every tuple of n with every tuple of m to see whether they
match and, if so, put them in the resulting relation.
So this operation is O(|n|*|m|) as well.
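
The pairwise comparison can be sketched as below. The names are
hypothetical and the early break is an assumption of this sketch; it
does not change the worst case.

```python
def intersection(n_rows, m_rows):
    """Keep the tuples of n that also appear in m."""
    result = []
    for n_row in n_rows:            # |n| iterations
        for m_row in m_rows:        # up to |m| comparisons each
            if n_row == m_row:
                result.append(n_row)
                break               # a match was found, stop scanning m
    return result


n_rows = [(1,), (2,), (3,)]
m_rows = [(2,), (3,), (4,)]
common = intersection(n_rows, m_rows)
```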

2.3 Difference

Same as above: O(|n|*|m|).

2.4 Union

This operation first creates a new relation by copying all the tuples
from one of the original relations, then compares the tuples of the
other relation against them and adds those that are not already
present.
In fact it is the same as above: O(|n|*|m|).
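
A sketch of the copy-then-compare approach, with hypothetical names
and relations assumed to be lists of tuples:

```python
def union(n_rows, m_rows):
    """Copy n, then add the tuples of m that are not in n."""
    result = list(n_rows)            # the copy: |n| steps
    for m_row in m_rows:             # |m| iterations
        if m_row not in n_rows:      # linear scan of n: up to |n| comparisons
            result.append(m_row)
    return result


n_rows = [(1,), (2,)]
m_rows = [(2,), (3,)]
merged = union(n_rows, m_rows)
```

Scanning n_rows rather than result is enough here because m is a
relation and is assumed to contain no duplicates of its own.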

2.5 Thetajoin

This operation is the combination of a product and a selection, so it
is O(|n|*|m|) too.
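
A sketch of the combination, with the product and the selection fused
into one pass for brevity. The names are hypothetical, and the two
relations are assumed to have no field names in common.

```python
def thetajoin(n_header, n_rows, m_header, m_rows, condition):
    """Product of n and m, filtered by the Python expression `condition`."""
    header = n_header + m_header
    code = compile(condition, "<thetajoin>", "eval")
    result = []
    for n_row in n_rows:                        # |n| iterations
        for m_row in m_rows:                    # |m| iterations each
            row = n_row + m_row                 # one tuple of the product
            if eval(code, {}, dict(zip(header, row))):   # the selection step
                result.append(row)
    return header, result


joined_header, joined_rows = thetajoin(
    ["a"], [(1,), (5,)],
    ["b"], [(2,), (3,)],
    "a < b",
)
```

The selection is linear in the size of the product, which is |n|*|m|,
so it does not change the overall bound.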

2.6 Outer

This operation is the union of the outer left and the outer right
join, which makes it O(|n|*|m|) too.

2.7 Outer_left

O(|n|*|m|), and strongly dependent on the number of fields, because
they are compared.
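
The dependence on the number of fields shows up in the comparison of
the shared fields inside the inner loop. The sketch below uses
hypothetical names, and padding unmatched tuples with None is an
assumption of this example, not necessarily what the program does.

```python
def outer_left(n_header, n_rows, m_header, m_rows):
    """Join on the shared fields, keeping unmatched tuples of n."""
    shared = [name for name in n_header if name in m_header]
    n_pos = [n_header.index(name) for name in shared]
    m_pos = [m_header.index(name) for name in shared]
    m_extra = [i for i, name in enumerate(m_header) if name not in shared]
    header = n_header + [m_header[i] for i in m_extra]

    result = []
    for n_row in n_rows:                        # |n| iterations
        matched = False
        for m_row in m_rows:                    # |m| iterations each
            # One comparison per shared field.
            if all(n_row[i] == m_row[j] for i, j in zip(n_pos, m_pos)):
                result.append(n_row + tuple(m_row[i] for i in m_extra))
                matched = True
        if not matched:
            # Assumption: missing values are filled with None.
            result.append(n_row + (None,) * len(m_extra))
    return header, result


out_header, out_rows = outer_left(
    ["id", "name"], [(1, "alice"), (2, "bob")],
    ["id", "city"], [(1, "rome")],
)
```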

2.8 Outer_right

Mirror operation of outer_left.

2.9 Join

Same as above.
|
|
|
|
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
|