SPLIT

A SPLIT clause contains a query, called a split query.   The SPLIT clause works within a SELECT statement to create rows that are permutations between the results of the SPLIT query and the results of the main SELECT query without the SPLIT.  Adding permutations increases the number of rows in the results table of the SELECT as compared to how many rows there would be without the SPLIT clause.

 

The name split can be confusing because split means to break apart into pieces, which is not what SPLIT does.  SPLIT does not break apart anything.  Instead, it multiplies the number of rows by combining results from the split query in permutations with results from the SELECT.   A clearer, albeit clunkier, name might have been PERMUTE or FOR_EACH

 

SELECT <field> /SPLIT <query>, ... FROM <table>

 

A SELECT query creates a results table with some number of rows.    A SPLIT query also creates a results table with some number of rows.   Using a SPLIT clause within a SELECT creates a results table that includes rows for all possible combinations of results between the SPLIT query and the SELECT query.  If the SELECT query without the SPLIT would have created 5 rows, and the SPLIT query creates 3 rows, the results table using SPLIT within the SELECT would be 15 rows.  

 

When SELECT does its work, evaluating the values of <field> for each record in the FROM table, for each such record it evaluates the SPLIT <query> and creates one or more new result records as generated by the query.   The ultimate results table will have rows for all permutations of the SELECT results and the SPLIT <query> results.   

 

 

Tutorial Example

Let us take a look at SPLIT in action.

 

il_split02_01.png

 

We have a table called Dishes with three rows, perhaps representing pasta dishes on a restaurant menu.

 

il_split02_02.png

 

The simple SELECT query:

 

SELECT

   [Pasta]

   FROM [Dishes];

 

... creates a results table with three rows, one row each for Spaghetti, Fettuccine, and Linguine.

 

SQL beginners sometimes think that all queries must begin with a SELECT, but there are many other forms of queries.  For example, we can write a simple query using the VALUES statement:

 

VALUES ('Tomato'), ('Cream'), ('Pesto');

 

il_split02_03.png

 

The VALUES query simply generates a results table with three rows, where each row contains a result field that is the string literal we specified.    To keep things as simple as possible for this example, we will use the VALUES query above as our split query.  Perhaps this VALUES query represents a choice of sauces in a restaurant that can be used for any pasta dish.

 

il_split02_04.png

 

The syntax for using SPLIT within our SELECT query is:

 

SELECT

   [Pasta],

   SPLIT <query>

   FROM [Dishes];

 

In the above, we replace <query> with our split query as follows:

 

il_split02_05.png

The full query seen above is:

 

SELECT

   [Pasta],

   SPLIT (VALUES ('Tomato'), ('Cream'), ('Pesto'))

   FROM [Dishes];

 

The SELECT query by itself without the SPLIT created three rows in the results table.   The VALUES query by itself created three rows in the results table.   The SELECT query with the SPLIT within, using the VALUES query as the split query, creates a results table that is all combinations, nine rows in all, of the results of the SELECT query together with the results of the SPLIT query.  This shows all the combinations of pasta dishes and sauces that can be ordered in the restaurant.

 

il_split02_11.png

 

Looking at the results table we can see why people often explain SPLIT as something that "expands" or "explodes" single records into groups of records, somewhat like the reverse of how a GROUP BY clause collapses groups of records into one record.   In the above results table what would have been just one record, for example, the Spaghetti record, in the SELECT has been expanded by the SPLIT into three records, Spaghetti with Tomato sauce, Spaghetti with Cream sauce, and Spaghetti with Pesto sauce.

 

That is one way to look at what SPLIT does, but it does not highlight the permutational, all-possible-combinations action of SPLIT, which becomes even more evident if we have more than one SPLIT clause within a SELECT.     We can see that by adding another SPLIT to our example, to incorporate permutations from a choice of toppings for our pasta dishes.

 

il_split02_06.png

Consider another simple VALUES query:

 

VALUES ('Cheese'), ('Truffle');

 

That query creates a results table with two rows, one for Cheese and one for Truffle.   Perhaps this VALUES query represents a choice of toppings that can be grated onto any pasta dish in the restaurant.

 

We can grow our example SELECT by adding a second SPLIT clause that uses the above VALUES query as a split query.

 

il_split02_07.png

 

Our SELECT query now reads:

 

SELECT

   [Pasta],

   SPLIT (VALUES ('Tomato'), ('Cream'), ('Pesto')),

   SPLIT (VALUES ('Cheese'), ('Truffle'))

   FROM [Dishes];

 

The results table now includes 18 rows, comprising all permutations of the results of three queries: the three results of the SELECT query by itself, the three results of the Tomato/Cream/Pesto split query and the two results of the Cheese/Truffle split query.   There are 18 rows because 3 * 3 * 2 = 18.   The results table shows all the possible orders in the restaurant, from Spaghetti with Tomato sauce and Cheese grated on top to Linguine with Pesto sauce and Truffle grated on top.

 

As becomes more obvious by adding more SPLIT clauses to the SELECT, the SPLIT clauses are not breaking anything apart into pieces but instead they are adding more permutations.

 

Note that the permutations are coming from the SPLIT clauses and do not arise from more fields added to the SELECT query, which returns the same number of rows in the results table even if more fields are added to the SELECT.

il_split02_08.png

Suppose we add an mfd_id field to our simple SELECT query using the Dishes table.

 

il_split02_09.png

We can now write the SELECT query:

 

SELECT

   [mfd_id], [Pasta]

   FROM [Dishes];

 

That query still generates only three rows.  It just adds an additional column to the results table for the mfd_id field.

 

If we add the mfd_id field into the SELECT query that used two SPLIT queries the number of rows returned also will not increase.

 

il_split02_10.png

The query:

 

SELECT

   [mfd_id], [Pasta],

   SPLIT (VALUES ('Tomato'), ('Cream'), ('Pesto')),

   SPLIT (VALUES ('Cheese'), ('Truffle'))

   FROM [Dishes];

 

Still generates 18 rows, since the SELECT query by itself still returns only three results while the SPLIT queries still return three and two results respectively.

 

The above example has been kept deliberately simple to make clear how SPLIT creates permutations.   In real life SPLIT tends to be used not with simple VALUES queries like those above, but with much more sophisticated split queries.   

 

SELECT [Name], Count(*), SPLIT (COLLECT [Property], [Value])

  FROM [mfd_meta] GROUP BY [Name];  

 

A real-world example using the gisadvisor.com SQLTRAINING.MAP data set:

 

SELECT parcel, SPLIT (COLLECT building, area ORDER BY area DESC FETCH 3)

FROM

(

  SELECT p.mfd_id AS parcel, b.mfd_id AS building,

    GeomArea(b.[geom (i)], 0) AS area

  FROM [parcels Table] AS p INNER JOIN [buildings Table] AS b

    ON GeomContains(p.[geom (i)], b.[geom (i)], 0)

)

GROUP BY parcel;

 

The inner query reports buildings in each parcel (ignoring buildings that cross parcels) along with their area.  The outer query reports up to the three largest buildings for each parcel.

 

Given a table that lists areas in a flood plain and the maximum flood depth for each area:

 

SELECT [FID_impact],

  SPLIT (COLLECT FloodDepth AS FirstDepth, P_Area ORDER by P_Area DESC FETCH 1)

INTO bobo2

FROM [Floodint Table]

GROUP BY [FID_impact];

 

... returns the largest area and its flood depth:

 

Examples

Splitting into constant number records with fixed split values:

 

SELECT mfd_id,

SPLIT (VALUES ('a'), ('b'), ('c')), name

FROM mfd_root;

 

Splitting according to value in the result record:

 

SELECT mfd_id,

SPLIT CALL ValueSequence(1, mfd_id, 1), name

FROM mfd_root;

 

Splitting according to value in the result record:

 

SELECT mfd_id,

SPLIT CALL ValueSequence(mfd_id+10, mfd_id*2+10, 1), name FROM mfd_root;

 

Multiple SPLITs (the order matters):

 

SELECT mfd_id,

SPLIT CALL ValueSequence(mfd_id+10, mfd_id*2+10, 1),

SPLIT (VALUES ('a'), ('b')), name

FROM mfd_root;

 

SPLIT producing multiple fields:

 

SELECT mfd_id,

SPLIT (VALUES ('a', mfd_id), ('b', mfd_id+1)), name

FROM mfd_root;

 

SPLIT wrapped into a FUNCTION:

 

FUNCTION f(@x INT32) TABLE

AS (VALUES ('a', @x), ('b', @x+1), ('c', @x+2)) END

SELECT mfd_id, SPLIT CALL f(mfd_id), name

FROM mfd_root;

 

See also the COLLECT topic for examples using SPLIT and a discussion of how SPLIT works with COLLECT.

 

Notes

No aliases - SPLIT does not allow aliases.   For example, not allowed is a construction such as:

 

SELECT ... SPLIT <table-expression> AS <name>   -- error

 

Order of precedence - SPLIT is often used with COLLECT and GROUP BY.  The order of precedence is first GROUP, then  aggregates, such as Min or Max or Sum or COLLECT (COLLECT is an aggregate that returns a table) and then finally SPLIT.

 

No brackets - In this topic to reduce visual clutter we do not use square [ ] brackets around simple field and table names.    Square brackets are optional if the name is unambiguous.

 

New and improved - SPLIT has changed significantly from SPLIT BY in pre-Radian Manifold products

 

See Also

Queries

 

Command Window

 

Query Builder

 

SQL Statements

 

COLLECT

 

GROUP BY

 

SPLIT

 

VALUES