Advanced Concatenate with Sqlite3

Last updated on October 28, 2022 pm

Advanced Concatenate with Sqlite3

CONCATENATE() is a useful function which can be found in Excel, numpy, etc. It could be simply interpreted as adding two or more strings together. The concatenate operator || could be used to add strings in SQL, but only use || can not satisfy us in much complicated context, i.e., string separated across columns. But as a powerful database manage system, Sqlite3 (and probably other DBMS) allows us to do this advanced CONCATENATE, even though there is no keyword or function written as CONCATENATE in SQL.

Quick Sample

Create Sample Table

1
2
3
4
CREATE TABLE IF NOT EXISTS Alphabet (
ID INTEGER PRIMARY KEY,
Letter CHAR NOT NULL
);

And insert values as follows:

1
2
3
4
INSERT INTO Alphabet (Letter)
VALUES
('A'), ('B'), ('C'), ('D'), ('E'),
('F'), ('G'), ('H'), ('I'), ('J');

So if now we select all entries from Alphabet,

1
2
3
4
5
.head on

.mode column -- make output more readable

SELECT * FROM Alphabet;

the output should look like this:

1
2
3
4
5
6
7
8
9
10
11
12
ID  Letter
-- ------
1 A
2 B
3 C
4 D
5 E
6 F
7 G
8 H
9 I
10 J

Requirement

Print a single string containing all the letters in ascending order, and letters are separated by one comma and one space.

Implement

So by means of CONCATENATE, we want to output some stuff like A, B, C, D, E, F, G, H, I, J. Since Letters are stored in column, it does not work if we only use concatenate operator ||.

The requirement sounds difficult, but if you see codes below, there is nothing too mysterious actually.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
WITH CTE AS (
SELECT
ID,
Letter
FROM
Alphabet
WHERE
ID = 1

UNION

SELECT
Alphabet.ID,
CTE.Letter || ', ' || Alphabet.Letter
FROM
Alphabet
JOIN
CTE ON Alphabet.ID = CTE.ID + 1
)

SELECT
Letter
FROM
CTE
ORDER BY
ID DESC
LIMIT 1;

Explanation

Firstly, we need to create a helper table CTE, which stands for Common Table Expression. The first row of CTE is the first row of Alphabet, this is where our story begins.

Next step is simple, we just need to add new lines into it, and for each line, it’s Letter is the concat of previous line (in CTE) and previous letter (in Alphabet).

In line 17, it seems like we JOIN the CTE to itself, and here is explanation. We actually JOIN the newly generated line to the previous line, and you don’t need to worry about ID exceeding boundary. Don’t confuse it with Self-Join. It is another topic and out of scope of this blog.

Note that in line 18, it’s Alphabet.ID = CTE.ID + 1, and this + 1 is very important. Without it, the query goes infinitely, and actually generate something like A, A, A, A, A, A, ... endlessly.

And we use UNION to combine two selected table.

So if you check what CTE really generate, execute query:

1
2
3
4
5
6
WITH CTE AS (
...
-- same as above
)

SELECT * FROM CTE;

And the output is surprisingly clear.

1
2
3
4
5
6
7
8
9
10
11
12
ID  Letter                      
-- ----------------------------
1 A
2 A, B
3 A, B, C
4 A, B, C, D
5 A, B, C, D, E
6 A, B, C, D, E, F
7 A, B, C, D, E, F, G
8 A, B, C, D, E, F, G, H
9 A, B, C, D, E, F, G, H, I
10 A, B, C, D, E, F, G, H, I, J

This output may to large extent help you understand how this query works.

The final step is just to sort them and output the line we want.

Remark

This is just like doing query recursivly, and Sqlite3 started supporting RECURSIVE keyword since version 3.34.0. Read more.

In real implementation, probably there is no consecutive ID and you may think of using a window function ROW_NUMBER() to help you.

Furthermore, this piece of SQL could be re-used every time you want to concatenate items cross different rows. It can be used as a fixed pattern only with minor modification.

Much More Complicated Ones

There are some more complex questions from CMU 15-445/645 :: Intro to Database Systems. In its homework 1, many questions are quite demanding, and Q10 is always linked to this blog’s topic.

2022 Fall Q10

Homework #1 - SQL Question 10 with official solution, and my code can be found here.

2021 Fall Q10

Homework #1 - SQL Question 10 with official solution, and my code can be found here.

Although added many other requirements and combined with other concepts, the essence never got changed.


Advanced Concatenate with Sqlite3
https://lingkang.dev/2022/10/27/concat/
Author
Lingkang
Posted on
October 27, 2022
Licensed under