This user guide is under construction.
This user guide should help a novice MEDOR user get familiar with the MEDOR concepts. More detailed documentation is available from the MEDOR JavaDoc.
MEDOR is a query framework. Its goal is the expression, optimization, compilation and evaluation of queries on data coming from distributed heterogeneous data sources.
Even though it can be used directly by a final user, it is intended to be used for building middleware which require query capacities. MEDOR personalities include EJB 2.0, with the JOnAS EJB server, and JDO, with the Speedo implementation.
MEDOR is in particular integrated with the persistence framework JORM. With regards to JORM, it is meant to be a complement for querying data made persistent through JORM.
The MEDOR user typically performs the following tasks:
This is done using a programmatic algebra. A MEDOR query is represented as a query tree, which implements the QueryTree interface.
MEDOR can optimize the QueryTree using various optimization rules, and produces an optimized QueryTree.
This optimized QueryTree is compiled prior to evaluation. This compilation may involve various steps such as indexes generation, bytecode generation, etc.
The evaluation of a query returns a TupleCollection, which gives access to the query results in a way similar to a JDBCResultSet.
MEDOR itself considers typed structured data. The typed structure is represented by a TupleStructure (org.objectweb.medor.api.TupleStructure), which is composed of a list of named and typed Fields (org.objectweb.medor.api.Field).
A TupleStructure is typically associated to a QueryTree (the basic structure in a query) or to a TupleCollection (the result of query evaluation).
In the case of JORM classes, the JORM meta schema implicitly defines corresponding MEDOR TupleStructures.
A MEDOR query is not expressed using a query language, but progammatically using an algebra.
The main interface is QueryTree. In general, a QueryTree is an abstraction which represents data in the form of Tuples. As such, each QueryTree has an associated TupleStructure.
The QueryTree interface is specialized in QueryNode and QueryLeaf (in org.objectweb.medor.query.api). A QueryLeaf encapsulates data typically coming from an underlying data store, whereas a QueryNode represents an algebraic operation working on data represented by other QueryTrees.
A QueryTree has an associated TupleStructure, representing the fields (the data typing information) of this QueryTree (see Defining fields).
The leaves represent data available either directly as memory data, or encapsulate access to an underlying data store.
An example of a memory data leaf is the MedorTCQueryLeaf (org.objectweb.medor.query.lib.MedorTCQueryLeaf).
Leaves accessing relational data stores are the RdbExpQueryLeaf and RdbStringQueryLeaf (in org.objectweb.medor.query.rdb.lib). RdbStringQueryLeaves allow the encapsulation of a SQL query expressed as a String. They are terminal in the sense that MEDOR will not attempt to rewrite such an RdbStringQueryLeaf in its optimization process. Instead, RdbExpQueryLeaves allow the encapsulation of a SQL query expressed in an abstract manner using QualifiedTables and Expressions. They is optimizable: several RdbExpQueryLeaves working on the same DataStore (in org.objectweb.medor.datasource.api) can be grouped by the optimizer, and transformed into a single RdbStringQueryLeaf for the evaluation; through this optimization process, MEDOR can delegate part of the query evaluation to the underlying database.
Other means of accessing existing data are the JormExtent and GenClassExtent (in org.objectweb.medor.query.jorm.lib). They represent data coming from JORM classes and GenClasses. Several constructors exist for JormExtent, leading to a QueryLeaf with either all Fields of the associated JORM class, only the PName, or some chosen fields.
A QueryNode can be built on one or several other QueryTrees (which are either QueryLeaves or QueryNodes), called its children. Depending on the type and constructor of the QueryNode, the children either explicit in the constructor of the QueryNode, or are implicitly derived from other parameters in the constructors, or from later added Propagated- or ConstructedFields, or from the filter Expression associated to the QueryNode (see table below).
A QueryNode defines its fields. Those fields are typically projected from existing fields of the children QueryTrees, or can be calculated (see Defining fields).
A QueryNode typically has an associated filter. This filter is used to define the conditions under which data from underlying children QueryTrees are considered as data of the current QueryNode.
Doing the parallel with SQL, the children represent the "FROM" part of the query, the filter represents the "WHERE" part of the query, and the fields represent the "SELECT" part of the query.
There are several predefined types of QueryNodes (in org.objectweb.medor.query.lib).
Node | Description | Children |
Cartesian | Cartesian product between tuples of two QueryTrees. The evaluation result will return all tuples which are the concatenation of one tuple from the left QueryTree with one tuple from the right QueryTree. | explicit in constructor |
Intersection | Represents the intersection of two QueryTrees. The evaluation result will contain only those tuples which are in both QueryTrees. Both QueryTrees must have compatible TupleStructure. | explicit in constructor |
JoinProject | Represents a join and project operation between two QueryTrees. The join condition is expressed as a filter. The Fields to project are described using the addPropagatedField and addCalculatedField methods (see Defining fields). | derived from added fields and filter |
Nest | Nesting operation. This is equivalent to the GROUP BY statements of SQL. See the Nest and Unnest section for more details. | derived from fields in constructor |
Project | Simple projection. It only works with one underlying QueryTree. The Fields to project are described using the addPropagatedField and addCalculatedField methods (see Defining fields). | derived from added fields and filter |
SelectProject | Simple projection with a Filter. It is the extension of Project with an associated filter. | derived from added fields and filter |
Union | Union of two QueryTrees. Like for Intersection, both QueryTrees must have compatible TupleStructure. | explicit in constructor |
Unnest | Unnesting operation. This is the reverse of the Nest operation. See the Nest and Unnest section for more details. | derived from fields in constructor |
In the case of JORM classes, MEDOR also provides a utility class, NavigatorNodeFactory (in org.objectweb.medor.query.lib), to ease the construction of QueryNodes performing navigation operations. The QueryTree produced by the NavigatorNodeFactory methods can be constructed "by hand"; the factory is provided as a facility. Three methods are available, which take a NavigatorOperator as input. In all cases, the result is a QueryTree, which contains all fields of the QueryTree hosting the start Field of the path expression, and the field reached by the end of the path expression. If this latter field is a GenClass, the GenClass element is projected.
Filters are expressed using Expressions (in org.objectweb.medor.filter.api). Expressions are constructed from Operators and Operands.
MEDOR comes with a set of predefined operators and operands, as described in the figure below.
We now detail the ParameterOperand and the FieldOperand.
A ParameterOperand allows the use of a variable for which the value in unknown at the time the query is being constructed. The actual value of the parameter will be passed as an argument to the MEDOR evaluator. This value will be used at evaluation time.
A FieldOperand represents a pointer to an existing Field. For a given QueryNode, the associated filter may not manipulate FieldOperands built on Fields of its own TupleStructure, but only on Fields of other QueryTrees. One noticeable exception is that of RdbExpQueryLeaves, where the associated Expression manipulates fields of the associated RdbExpQueryLeaf.
A NavigatorOperator is used in the JORM context to express a path traversal. A path traversal can be represented by f1.f2...fn, where f1, ..., fn-1 are fields of type reference to another class, and fn can either be a single-valued field (whether a RefClass or a primitive type), or a reference to a GenClass. An example of a path traversal can be department.manager.projects, representing the projects under the responsibility of the department manager.
QueryTree fields can be of different nature. For QueryLeaves, the fields are defined when creating the leave. For QueryNodes, a field can be defined either as:
For JoinProject, Project and SelectProject, the Fields which are present in the resulting QueryNode are defined using the addPropagatedField and addCalculatedField methods.
In the case of Union, Cartesian and Intersection, the fields are implicitly defined.
For Nest and Unnest, see the next section.
Nest produces a new QueryTree where some fields are grouping fields (the GROUP BY statement): for all tuples sharing the same value of those grouping fields, the remaining fields are grouped into a new Field of type TupleCollection.
The following figure describes a simple example with two QueryLeaves, emp and dept. The TupleStructure of emp contains fields named name and dept, that of dept fields named deptName and town. QueryNode worksIn is a JoinProject, built upon those two QueryLeaves. Its TupleStructure has two PropagatedFields, named name and town, respectively coming from name of emp and town of dept. QN also has a filter, which specifies that dept of emp should be equal to deptName of dept.
The tables below shows the result of evaluating QN given data for QL1 and QL2.
|
|
|
At optimization time, MEDOR uses a QueryRewriter (in org.objectweb.medor.optim.api) to transform the input QueryTree, applying various RewriteRules. The order in which the rules are applied is determined by a RuleConfiguration (in org.objectweb.medor.optim.api). For QueryLeaves, the LeafRewriteRule interface extends the RewriteRule interface, and relies on LeafRewriters which are capable of rewriting certain types of QueryLeaves.
The rules are responsible for various types of optimization. Below is the current list of rewrite rules, in the typical order in which they should be applied (this order is defined in the BasicQueryRewriter (org.objectweb.medor.optim.lib) and its extension, the JormQueryRewriter (in org.objectweb.medor.optim.jorm).
rule | package | description |
PushNotInExpressionRule | org.objectweb.medor.optim.lib | Manipulates filters only (no changes to the QueryTree structure). Pushes negations inside the expression. For example, !(a > b) is replaced by (a <= b) |
PushSelectionRule | org.objectweb.medor.optim.lib | Pushes a selection as far down as possible into its children nodes. The goal is to perform a selection on a Field as early as possible, i.e., as low in the QueryTree as possible. |
DropUnusedProjFieldsRule | org.objectweb.medor.optim.lib | Drops fields which are projected in intermediary QueryNodes but which are not used by the parent QueryNode. |
DropUselessNodeRule | org.objectweb.medor.optim.lib | Eliminates nodes which do not do any selection (no filter), and for which all fields are propagated fields with one ancestor only. Such nodes may appear after the previous PushSelection and DropUnusedProjFields rules have been applied. |
GroupSameDBRule | org.objectweb.medor.optim.rdb | For RdbExpQueryLeaves, groups together as a single RdbExpQueryLeaf all leaves which operate on the same DataStore. |
JormAssignMapperRule | org.objectweb.medor.optim.jorm | Assigns a mapper to each JORM class. |
JormLeafRewriterRule | org.objectweb.medor.optim.jorm | Rewrites JORM leaves into the corresponding QueryLeaves on the underlying data store. For example, if a JORM class is mapped to a relational database, the corresponding class extent will be rewritten into an RdbExpQueryLeaf. |
JormGoUpDecodeRule | org.objectweb.medor.optim.jorm | Moves the construction of the PName as far up in the QueryTree as possible. The reason behind this rule is as follows. When transforming JORM extents into QueryLeaves, JORM PNames are replaced with the construction of the PName using the CompositePName and SinglePName operators (in org.objectweb.medor.filter.jorm.lib). By moving the construction of the PName up in the QueryTree, and replacing PName comparisons with the comparision of the underlying Fields, this will allow the delegation of larger parts of a QueryTree to the underlying data store. |
Note that the BasicQueryRewriter (in org.objectweb.medor.optim.lib), which is both a QueryRewriter and a RuleConfiguration, contains a predefined combination of rules.
[TODO: add an example here showing a complete rewrite, or point to a complete example, to be written...]
The result of a query evaluation is a TupleCollection (org.objectweb.medor.tuple.api.TupleCollection). It represents a collection of structured typed Tuples (org.objectweb.medor.tuple.api.Tuple). This typed structure is represented by a TupleStructure (org.objectweb.medor.api.TupleStructure).
This manual dated 11 October 2002
Author: Alexandre Lefebvre