SQLxD Part 5 — Generating columns

This is the fifth part of the SQLxD series. For your convenience you can find other parts in the table of contents in Part 1 – XML Transformation

We already have the model, can parse XML document, and know how to refer to specific node in it. It is time to transform node into row.

As described in XML Transformation we map single node (optionally containing nested nodes) into single row (just like in SQL database). For that, we flatten all nested properties and choose the first value for each of them. We store result in the following manner:

Each row has cells and guids. Cells represent column values, whereas guids represents row origin.

What is origin? We have notion of natural join in our database. In typical SQL databases natural join uses columns with the same names in both tables as a predicate. However, in SQLxD natural join means connecting by parent-child relationship. Since we are allowed to rename columns when performing selection we cannot rely on mechanism comparing only column names (even using some fancy logic with stripping prefixes etc). However, during parsing we assign guid to each node so we can store them in rows to remember row’s origin.

Rest of the model is as follows:

Building rows

Below goes the code for creating row. First, we extract all column names in order to find common schema for all resulting rows (since different documents might have different set of values we need to insert nulls in case of missing value). Next, we create rows.

That’s all. Now we have our nodes transformed into rows so we can start implementing query logic. Most of the operations described in next parts will be rather straightforward — after all they are just relational operations.