Eight Queens in SQL [3]

The third article in the series.

Links to all previous and this post’s code.
Part Main Topic Code (pgSQL)
1 Domains, Types part 1
2 Predicates, Constraints, Relations part 2
3 Propositions and Facts part 3

Relation Values

To populate tables simply run the code; it is self explanatory. You may want to keep an eye on the model while reading the code.

Propositions and Facts

A proposition is a declarative sentence, which can be evaluated (flagged) as true or false. A fact is a proposition considered to be true.

Starting with an example:

SELECT * from board;

SQUARE  CL   RW
---------------
  a1    a    1
  a2    a    2
  ... 
  h8    h    8

consider the result in conjunction with the predicate: [p_03] the board square named (SQUARE) is located in the row labelled (RW) and the column labelled (CL).

Substituting predicate variables with values from the query results in a set of facts, a set of declarative sentences considered to be true.

The board square named 'a1' is located in the row labelled 'a' and the column labelled '1'.

The board square named 'a2' is located in the row labelled 'a' and the column labelled '2'.

...

The board square named 'h8' is located in the row labelled 'h' and the column labelled '8'.

Each of these sentences can be seen as as a term that evaluates to true, hence it is possible to look at the relation board and the matching predicate [p_03] as a function:

board::(SQUARE, RW, CL) → TRUE

-- or more general

board::(SQUARE, RW, CL) → Boolean

So, board is a Boolean function with three parameters: SQUARE, RW, and CL.

Staring with the function board and knowing keys of the relation ({SQUARE}, {RW, CL}) it is easy to derive few related functions.

board::(SQUARE, RW, CL) → TRUE


f1:: SQUARE → (RW, CL) -- returns pair

f2::(RW, CL) → SQUARE

f3:: RW → {SQUARE} -- returns set

f4:: CL → {SQUARE} -- returns set

It takes just a bit of mental exercise to see things this way, and I would encourage you to practice a bit. Once and for all, the exercise will liberate you from the association of the term “relational” with “something to do with a square-shaped data-storage“, which happens to be the popular opinion.

Towards the Solution

It it easy to see how this thinking can be applied to the relation square_line and the matching predicate [p_05].

square_line::(SQUARE, ANCHOR, MOV_DIR) → TRUE

g1:: SQUARE  → {(ANCHOR, MOV_DIR)}

g2::{SQUARE} → {(ANCHOR, MOV_DIR)}

g3:: (ANCHOR, MOV_DIR)  → {SQUARE}

g4::{(ANCHOR, MOV_DIR)} → {SQUARE}

Look at the board and consider the query:

select * 
from square_line
where square in ('c5', 'f2');

square   anchor    mov_dir
---------------------------
  c5       a3      RGHT-UP
  c5       a5      RGHT
  c5       a7      RGHT-DN
  c5       c1      UP
  f2       a2      RGHT
  f2       a7      RGHT-DN
  f2       e1      RGHT-UP
  f2       f1      UP

Given the predicate [p_05], the result is a set of facts:

The square named 'c5' is located on the line identified by the starting square 'a3' and the direction of movement 'RGHT-UP'.

The square named 'c5' is located on the line identified by the starting square 'a5' and the direction of movement 'RGHT'.

The square named 'c5' is located on the line identified by the starting square 'a7' and the direction of movement 'RGHT-DN'.

The square named 'c5' is located on the line identified by the starting square 'c1' and the direction of movement 'UP'.

The square named 'f2' is located on the line identified by the starting square 'a2' and the direction of movement 'RGHT'.

The square named 'f2' is located on the line identified by the starting square 'a7' and the direction of movement 'RGHT-DN'.

The square named 'f2' is located on the line identified by the starting square 'e1' and the direction of movement 'RGHT-UP'.

The square named 'f2' is located on the line identified by the starting square 'f1' and the direction of movement 'UP'.

Suppose two queens are placed on the board, on squares ‘c5’ and ‘f2’:

select anchor, mov_dir
from square_line
where square = 'c5'

intersect

select anchor, mov_dir
from square_line
where square = 'f2'
;

anchor  mov_dir
---------------------------
  a7    RGHT-DN

This fact may be verbalised as:

The queens are located on the line identified by the starting square 'a7' and the direction of movement 'RGHT-DN'.

Now change ‘f2’ to ‘g2’ and run the query.
Next time: the algorithm and the solution to the puzzle.

Eight Queens in SQL [2]

This is the second article in the series; the previous one introduced the problem and specified domains.

Links to all previous and this post’s code.
Part Main Topic Code (pgSQL)
1 Domains, Types part 1
2 Predicates, Constraints, Relations part 2

The Essence

The idea is to describe the problem using natural language and concepts of domains, predicates, constraints, relations and keys; here are a few simple rules to keep in mind.

From predicates and constraints to relations and keys
  • A predicate variable — from a specific domain — maps to an attribute of the relation.
  • Internal (to predicate) uniqueness constraints map to keys of the relation.
  • External (to predicate) inclusion constraints map to foreign keys between two relations.
  • A predicate and the matching relation represent — evaluate to — a set of facts about the universe of discourse (the problem).
  • A predicate and the matching relation should not be separated, otherwise information will be lost, for all practical purposes.
For SQL implementation
  • Relations map to tables.
  • Derived relations map to views.
  • Attributes map to columns.
  • Domains map to data types.
  • A key maps to a primary key or an alternate key (unique constraint).
  • A foreign key maps to a foreign key.
  • Predicates and constraints are added as comments on matching database objects (tables, keys).

Predicates, Constraints, Relations

[p_01] The column labelled (CL) with the assigned column number (CL_NO) exists.

(c1.1) Each column is assigned exactly one column number.
(c1.2) Each column number is assigned to exactly one column.

a_col { CL     col_lbl      -- p_01
      , CL_NO  ordinal_num
      }

  KEY {CL}                  -- c1.1

  KEY {CL_NO}               -- c1.2

 
[p_02] The row labelled (RW) with the assigned row number (RW_NO) exists.

(c2.1) Each row is assigned exactly one row number.
(c2.2) Each row number is assigned to exactly one row.

a_row { RW     row_lbl      -- p_02
      , RW_NO  ordinal_num
      }

  KEY {RW}                  -- c2.1

  KEY {RW_NO}               -- c2.2

 
[p_03] The board square named (SQUARE) is located in the row labelled (RW) and the column labelled (CL).

(c3.1) Each square is located in exactly one row; for each row it is possible that more than one square is located in that row.
(c3.2) Each square is located in exactly one column; for each column it is possible that more than one square is located in that column.
(c3.3) For each row and column, exactly one square is located in both that row and that column.
(c3.4) If a square is located in a row then that row must exist.
(c3.5) If a square is located in a column then that column must exist.

board { SQUARE  square_name   -- p_03
      , CL      col_lbl
      , RW      row_lbl
      }

  KEY {SQUARE}          -- c3.1, c3.2

  KEY {CL, RW}          -- c3.3

FOREIGN KEY {RW}        -- c3.4
 REFERENCES a_row {RW}

FOREIGN KEY {CL}        -- c3.5
 REFERENCES a_col {CL}

 
[p_04] The line identified by the starting square (ANCHOR) and the direction of movement (MOV_DIR) exists.

(c4.1) Each line is identified by exactly one starting square and direction of movement combination.
(c4.2) If a starting square is part of a line identifier, then that square must exist.

line { ANCHOR   square_name  -- p_04
     , MOV_DIR  mov_dir
     }

 KEY {ANCHOR, MOV_DIR}       -- c4.1

FOREIGN KEY {ANCHOR}         -- c4.2
 REFERENCES board {SQUARE}

 
[p_05] The square named (SQUARE) is located on the line identified by the starting square (ANCHOR) and the direction of movement (MOV_DIR).

(c5.1) For each square, that square may be located on more than one line.
(c5.2) For each line, that line may contain more than one square.
(c5.3) For each square and line, that square and line combination occurs at most once.
(c5.4) If a square is located on a line then that square must exist.
(c5.5) If a square is located on a line then that line must exist.

square_line { SQUARE   square_name    -- p_05
            , ANCHOR   square_name
            , MOV_DIR  mov_dir
            }

KEY {SQUARE, ANCHOR, MOV_DIR} -- c5.3, c5.2, c5.1

FOREIGN KEY       {SQUARE}            -- c5.4
 REFERENCES board {SQUARE}

FOREIGN KEY      {ANCHOR, MOV_DIR}    -- c5.5
 REFERENCES line {ANCHOR, MOV_DIR}

 

For convenience I will add a relation that allows me to access a square by name, row and column labels, or row and column numbers. This can be derived from the existing relations: board, a_col, a_row.

[p_06] The board square named (SQUARE) is located in the row labelled (RW), row number (RW_NO); and the column labelled (CL), column number (CL_NO).

-- derived relation
--
board1 { SQUARE  square_name   -- p_06
       , CL      col_lbl
       , RW      row_lbl
       , CL_NO   ordinal_num
       , RW_NO   ordinal_num
       }

  KEY {SQUARE}
  KEY {CL, RW}
  KEY {CL_NO, RW_NO}


-- derive using 
--

board1 =    {SQUARE, CL, RW} 
       JOIN {CL, CL_NO} 
       JOIN {RW, RW_NO}

board1 = JOIN {board, a_col, a_row}

 

There is one more predicate and relation to consider: the actual solution. At this point we do know that eight queens have to be placed on eight squares, and that there may be more than one solution. There will be some kind of calculation (algorithm) involved so the final relation will be derived from others.

[p_07] Placing queens on squares (S1), (S2), (S3), (S4), (S5), (S6), (S7), and (S8) is a solution to the puzzle.

(c7.1) A solution is identified by combination of squares S1 … S8.

-- derived relation
--
solution { S1 square_name             -- p_07
         , S2 square_name
         , S3 square_name
         , S4 square_name
         , S5 square_name
         , S6 square_name
         , S7 square_name
         , S8 square_name
         }

KEY {S1, S2, S3, S4, S5, S6, S7, S8}  -- c7.1

SQL Implementation

Creating an entity-relationship model is purely optional, but it does help.

Again, the translation from the reasoning to the SQL code is straightforward, here is the part 2 of the code.

What we have now are essentially empty relational variables. The relational variables a_col, a_row, board, line, and square_line need actual relational values to represent sets of facts about the chessboard.

Next time: relation values, propositions, and facts.

Eight Queens in SQL [1]

The puzzle is to place eight queens on a chessboard so that no two queens attack each other. The problem is old and has been well studied, the Rosetta Code has solutions for more than 80 programming languages, SQL included.

However, the Rosetta’s SQL solution is implementing a recursive algorithm in SQL, as opposed to a relational approach. In this article I will use a relational design approach to the problem, and then translate that to SQL. The main idea is to design using natural language and concepts of domains, predicates, constraints, sets, set operations, and relations..

The logical design process is database agnostic; for the initial code samples I will use PostgreSQL, which can later be translated to other SQL dialects.

Domains

A domain is simply a set of all possible values over which a variable of interest may range.

The board is composed of 64 named squares, organised in eight rows and eight columns; rows and columns are labelled.

square_name = {'a1' .. 'h8'} -- set of square names
row_lbl     = {'1'  .. '8'}  -- set of row labels
col_lbl     = {'a'  .. 'h'}  -- set of column labels

chessboard

It is convenient to refer to a square by an offset from another one (say +/-1), so I will assign ordinal numbers to rows and columns.

ordinal_num = {1 .. 8} -- set of ordinal numbers
                       -- for rows and columns.

Two queens attack each other if they are placed on the same row, column or diagonal. A queen moves in straight lines; so I will use a generic term line to designate a row, a column or a diagonal.

A line (row, column, or diagonal) is a set of squares. To solve the puzzle two queens can not be placed on a square belonging to the same line.

In order to identify a line I will use a starting square (anchor) and direction of motion. For example column b can be identified with an anchor b1 and direction UP; diagonal a2:g8 would have anchor a2 and direction right-up. So, this introduces set of anchors and set of movement directions.

mov_dir = {'UP', 'RGHT', 'RGHT-UP', 'RGHT-DN'}

An anchor is a square identified by its name, hence a set of anchor names is a proper subset of the set of square names; more about anchors a bit later.

SQL Implementation

The reasoning about domains can be nicely translated to PostgreSQL using types; here is the part 1 of the code. The translation from the reasoning to the code is straightforward, which was the idea in the first place.

Next time: predicates, constraints, relations.

PostgreSQL, Send More Money

It is often forgotten that the original idea which led to the development of SQL is the relational model and that it stems from (first-order) logic. The idea is based on concepts of predicates, facts, constraints, sets, set operations, logical operators, relations, relational operators, etc.

Hence, most problems that can be reasoned about in these terms can be easily implemented in SQL. The implementation from the reasoning to the code is often straightforward.

Consider a paradigm of constraint programming over finite domains: the common example for this is the puzzle SEND + MORE = MONEY where each letter represents a different digit.

   SEND
+  MORE
= MONEY

given S <> 0, M <> 0

It is fairly easy to reason about this problem in terms of sets, set operations, constraints, predicates, and logical operations; hence the translation from the reasoning to the SQL code should be straightforward.

Reasoning

The domain for each digit is {0 … 9}; so let’s start with a set named Digit and eight aliases for that same set. I will treat each digit in the result as an element of the corresponding alias set, essentially they are all elements of the Digit set too.

Digit = {0 ... 9}

St = Et = Nt = Dt = Mt = Ot = Rt = Yt = Digit

S ∈ St
E ∈ Et
N ∈ Nt
D ∈ Dt
M ∈ Mt
O ∈ Ot
R ∈ Rt
Y ∈ Yt

The search space is the Cartesian product of these sets — a finite set of tuples — and the result set is a subset of this.

Srch = St X Et X Nt X Dt X Mt X Ot X Rt X Yt

Srch = {  (0, 0, 0, 0, 0, 0, 0, 0)
        ...
        , (9, 9, 9, 9, 9, 9, 9 ,9)
       }

Result ⊆ Srch

Given that the search set is finite, it is possible to apply constraints to each tuple in the set and keep only tuples that satisfy these constraints.

The original set of constraints, as per the puzzle problem, can be written as:

M > 0

S > 0

  D + N * 10 + E * 100 + S * 1000
+ E + R * 10 + O * 100 + M * 1000
= Y + E * 10 + N * 100 + O * 1000 + M * 10000

all_different (S,E,N,D,M,O,R,Y)

The last — all_different — constraint seems obvious, but is needed because the Cartesian product repeats digits within a tuple.

The next step is to check if it is possible to reduce the search space by reasoning about the problem.

Reducing the Search Space

Given that the result has more digits than both operands and that the maximum sum of two single digit numbers is 18, the maximum carry is 1; therefore the first digit of the result must be 1, that is, M = 1.

Given that M = 1 and there exists a carry in (S + M), S must be at least 8, that is, S >= 8.

From the first column, (D + E) may or may not have a carry, but in both cases it holds that (D+E) modulo 10 = Y.

Step Reasoning Note Search Space
1 Cartesian Product 10^8 1E+08 = 100M
2 M=1 also step 1 1E+07 = 10M
3 S >= 8 also steps 1,2 2E+06 = 2M
4 (D+E) mod 10 = Y also steps 1,2,3 2E+05 = 200K

So, by simple reasoning it was possible to reduce the search space by 500 times. Required memory space (PostgreSQL) for this data set may be estimated to (8*4 + 24) * 200K bytes, approximately 11.5 MB — which should fit into a memory of a modest laptop.

To summarize all constraints:

M = 1  -- implies M > 0

S >= 8 -- implies S > 0

(D + E) mod 10 = Y

  D + N * 10 + E * 100 + S * 1000
+ E + R * 10 + O * 100 + M * 1000
= Y + E * 10 + N * 100 + O * 1000 + M * 10000

all_different (S,E,N,D,M,O,R,Y)

SQL Implementation

Translation to the SQL code is now straightforward — you should be able to read the code and the reasoning section in parallel.

  • The all_different constraint is implemented as a separate Boolean function which takes an array of integers for an input.
  • Constraints are essentially Boolean expressions and are placed in the WHERE clause of the query.
  • The Digit domain (set) is represented by a relation with one attribute (single column).
  • The cross product of sets St … Yt translates to CROSS JOINs in the FROM clause of the query.
  • The final result is packaged in a view named Send_MM.
  • SELECT * FROM Send_MM;
    
    s   e   n   d   m   o   r   y
    ------------------------------
    9   5   6   7   1   0   8   2
    
       9567
    +  1085
    = 10652
    

    On my machine this takes 160 milliseconds, should run fine even on very modest hardware.

    Summary

    To summarize, the idea of this exercise is to show that the relational model (and SQL) is much more than just “tabular data storage”, as often perceived.

    If a problem can be reasoned about in a natural language, using concepts like sets, set operations, domains, predicates, constraints, relations, logical operations, etc., then translation to a relational model and the SQL is fairly simple.