Eight Queens in SQL [4]

This is the fourth and final part of the series showing how to solve the “Eight Queens” chess puzzle, using a database design and SQL. The main idea is to design using natural language and concepts of domains, predicates, constraints, sets, set operations, and relations.

Links to all previous and this post’s code.
Part Main Topic Code (pgSQL)
1 Domains, Types part 1
2 Predicates, Constraints, Relations part 2
3 Propositions and Facts part 3
4 The Algorithm and the Solution part 4

Algorithm

Say we place a queen, Q1, on square c5, the queen will cover row 5, column c, and diagonals a3:f8 and a7:g1. Each one of these is a line (a set of squares), so the square c5 is an element of each of these four sets. Using the (anchor, movement direction) notation to identify a line, the following holds:

     row 5     = ('a5', 'RGHT')    = {'a5' ... 'h5'}
  column c     = ('c1', 'UP')      = {'c1' ... 'c8'}
diagonal a3:f8 = ('a3', 'RGHT-UP') = {'a3' ... 'f8'}
diagonal a7:g1 = ('a7', 'RGHT-DN') = {'a7' ... 'g1'}

'c5' ∈ ('a5', 'RGHT') 
'c5' ∈ ('c1', 'UP')
'c5' ∈ ('a3', 'RGHT-UP')
'c5' ∈ ('a7', 'RGHT-DN')

Let’s introduce a set, L1, which is the set of these four lines; note that L1 is a set of sets.

L1 = { ('a5', 'RGHT')   , ('c1', 'UP'),
       ('a3', 'RGHT-UP'), ('a7', 'RGHT-DN')
     }

'c5' ∈ ⋃ L1

⋂ L1 = {'c5'}

⋃ L1 is the union of all sets in L1, and represents all squares that the queen placed on c5 attacks. ⋂ L1 is the intersection of these sets and it has only one element, the c5 square itself.

Now let’s place another queen, Q2, on square f2; using the reasoning from the previous example, the following holds:

L2 = { ('a2', 'RGHT')   , ('f1', 'UP'),
       ('e1', 'RGHT-UP'), ('a7', 'RGHT-DN')
     }

'f2' ∈ ⋃ L2

⋂ L2 = {'f2'}

The rule that two queens attack each other if placed on the same line (row, column or diagonal) can be expressed as an intersection between sets L1 and L2. If the intersection is an empty set then they do not attack each other, otherwise they do; in this example they both have one diagonal in common.

L1 ⋂ L2 = {('a7', 'RGHT-DN')}

Suppose that eight queens Q1 … Q8 are placed on a chessboard one by one. As queen Qn is placed on a square we calculate the corresponding set Ln. The intersection of set Ln — for the queen being placed — and union of all previous L1 … Ln-1 sets — for queens already on the board — must be an empty set.

Queen  Rule for placement

Q1     L1 ⋂  {} = {}  -- always True
Q2     L2 ⋂  L1 = {}
Q3     L3 ⋂ (L1 ⋃ L2) = {}
Q4     L4 ⋂ (L1 ⋃ L2 ⋃ L3) = {}
Q5     L5 ⋂ (L1 ⋃ L2 ⋃ L3 ⋃ L4) = {}
Q6     L6 ⋂ (L1 ⋃ L2 ⋃ L3 ⋃ L4 ⋃ L5) = {}
Q7     L7 ⋂ (L1 ⋃ L2 ⋃ L3 ⋃ L4 ⋃ L5 ⋃ L6) = {}
Q8     L8 ⋂ (L1 ⋃ L2 ⋃ L3 ⋃ L4 ⋃ L5 ⋃ L6 ⋃ L7) = {}

In general, to place queen Qn on the board, the following must hold:

⋃{L1 ... Ln-1} ⋂ Ln = {}

This now looks like an algorithm; although it does appear recursive, I will avoid recursion in this example.

Reducing the Search Space

A solution to the problem is in the form of:

 
{s1, s2, s3, s4, s5, s6, s7, s8}

sn ∈ square_name

where sn is a square identified by a square name.

In general, there is more than one solution to the puzzle, hence the
final result is a set of all possible solutions:

 
{
  {s1.1, s1.2, s1.3, s1.4, s1.5, s1.6, s1.7, s1.8}
, {s2.1, s2.2, s2.3, s2.4, s2.5, s2.6, s2.7, s2.8}
, ...
}

si.n ∈ square_name

The idea is now to estimate the search space for which the placement rules have to be evaluated, and reduce it as much as possible by reasoning about the problem.

In the first estimate we can start with reasoning that the first queen can be placed on one out of 64 squares, second on one out of 63, third on one out of 62, etc. The number of permutations here is 64! / (64-8)! = 1.78E+14.

However, the order of queens within one solution does not matter, so we should use combinations instead and choose 8 out of 64 — (64! / (8! x (64-8)!) = 4.43E+9.

Given that two queens can not be placed in the same column, we can start placing them column by column, so each queen can be placed on a maximum of eight squares — one column. This brings down the number of combinations to 8^8 = 1.68E+7.

Given that two queens can not be in the same row, each subsequent placement has one square less available than the previous one; this now brings number of combinations to 8! = 40320.

Further improvement can be made avoiding rows +/-1 from the queen in the previous column. For example, if the first queen was placed in column ‘a’, row 4, then the second queen can be placed in column ‘b’, but not in rows 3, 4 or 5; this further reduces the number of possible combinations to 5424.

To summarize the reasoning about number of combinations involved:

Reasoning Calc Search Space
Permute 8 out of 64 64! / (64-8)! 1.78E+14
Choose 8 out of 64 64! / (8! x (64-8)!) 4.43E+9
By column 8^8 1.68E+7
By column, avoid previous rows 8! 40320
By column, avoid previous rows,
avoid row +/-1 from the queen
in the previous column.
Simply count 5424

This looks quite feasible; in total the placement rules argued in the algorithm section have to be evaluated for 5424 combinations.

Solution

Take a look at the code; the result is implemented as a single query in a view named solution.

There are 92 possible solutions to the puzzle:

select * from solution;


s1  s2  s3  s4  s5  s6  s7  s8
------------------------------
a1  b5  c8  d6  e3  f7  g2  h4
a1  b6  c8  d3  e7  f4  g2  h5
...
a8  b3  c1  d6  e2  f5  g7  h4
a8  b4  c1  d3  e6  f2  g7  h5

It should be easy to read the code and the reasoning section in parallel, which has been the main guiding principle of this exercise.

The first part of the query — CTE named w — reduces the search space as described in the previous section. Consider the step of placing queen Q3 on column c:

JOIN board1 as q3 
            ON q3.CL  = 'c' 
           and q3.RW not in (q1.RW, q2.rw) 
           and q3.RW_NO not BETWEEN (q2.RW_NO - 1)
                                and (q2.RW_NO + 1)

The predicate in the join condition is selecting the column c, avoiding the rows of the two previously placed queens, and avoiding rows +/- 1 from the queen next to it (Q2).

The CTE w returns a set of tuples — the reduced search space — then the second part of the query applies the rules (constraints) from the algorithm section to this set.

Using queen Q3 as an example, the rule for placement:

L3 ⋂ (L1 ⋃ L2) = {}

is implemented as:

and not exists ( -- placing queen 3

      SELECT distinct e.ANCHOR, e.MOV_DIR 
        FROM square_line as e 
        WHERE e.SQUARE in (w.s1, w.s2)

      INTERSECT

      SELECT distinct e.ANCHOR, e.MOV_DIR 
        FROM square_line as e 
       WHERE e.SQUARE = w.s3
    )

Evaluating the Solution

When evaluating a solution to a problem, I tend to focus on the following criteria:

  1. Correctness.
  2. Ability to reason about the code and the problem now.
  3. Ability to reason about the code and the problem in the future. Suppose that this happens to be a part of a much larger project, and ten years from now a “third generation” of developers has to understand the code. How hard will it be?
  4. Ability to change technology and scale in size. Instead of arguing that this approach can easily scale and change technologies, I will simply present it in a few future posts.
  5. Performance: although I did not even attempt to optimize queries — and have given priority to readability and reasoning — the solution runs reasonably well.

I encourage you to compare this to Rosetta’s implementation — and a few more — using all of the criteria.

Eight Queens in SQL [3]

The third article in the series.

Links to all previous and this post’s code.
Part Main Topic Code (pgSQL)
1 Domains, Types part 1
2 Predicates, Constraints, Relations part 2
3 Propositions and Facts part 3

Relation Values

To populate tables simply run the code; it is self explanatory. You may want to keep an eye on the model while reading the code.

Propositions and Facts

A proposition is a declarative sentence, which can be evaluated (flagged) as true or false. A fact is a proposition considered to be true.

Starting with an example:

SELECT * from board;

SQUARE  CL   RW
---------------
  a1    a    1
  a2    a    2
  ... 
  h8    h    8

consider the result in conjunction with the predicate: [p_03] the board square named (SQUARE) is located in the row labelled (RW) and the column labelled (CL).

Substituting predicate variables with values from the query results in a set of facts, a set of declarative sentences considered to be true.

The board square named 'a1' is located in the row labelled 'a' and the column labelled '1'.

The board square named 'a2' is located in the row labelled 'a' and the column labelled '2'.

...

The board square named 'h8' is located in the row labelled 'h' and the column labelled '8'.

Each of these sentences can be seen as as a term that evaluates to true, hence it is possible to look at the relation board and the matching predicate [p_03] as a function:

board::(SQUARE, RW, CL) → TRUE

-- or more general

board::(SQUARE, RW, CL) → Boolean

So, board is a Boolean function with three parameters: SQUARE, RW, and CL.

Staring with the function board and knowing keys of the relation ({SQUARE}, {RW, CL}) it is easy to derive few related functions.

board::(SQUARE, RW, CL) → TRUE


f1:: SQUARE → (RW, CL) -- returns pair

f2::(RW, CL) → SQUARE

f3:: RW → {SQUARE} -- returns set

f4:: CL → {SQUARE} -- returns set

It takes just a bit of mental exercise to see things this way, and I would encourage you to practice a bit. Once and for all, the exercise will liberate you from the association of the term “relational” with “something to do with a square-shaped data-storage“, which happens to be the popular opinion.

Towards the Solution

It it easy to see how this thinking can be applied to the relation square_line and the matching predicate [p_05].

square_line::(SQUARE, ANCHOR, MOV_DIR) → TRUE

g1:: SQUARE  → {(ANCHOR, MOV_DIR)}

g2::{SQUARE} → {(ANCHOR, MOV_DIR)}

g3:: (ANCHOR, MOV_DIR)  → {SQUARE}

g4::{(ANCHOR, MOV_DIR)} → {SQUARE}

Look at the board and consider the query:

select * 
from square_line
where square in ('c5', 'f2');

square   anchor    mov_dir
---------------------------
  c5       a3      RGHT-UP
  c5       a5      RGHT
  c5       a7      RGHT-DN
  c5       c1      UP
  f2       a2      RGHT
  f2       a7      RGHT-DN
  f2       e1      RGHT-UP
  f2       f1      UP

Given the predicate [p_05], the result is a set of facts:

The square named 'c5' is located on the line identified by the starting square 'a3' and the direction of movement 'RGHT-UP'.

The square named 'c5' is located on the line identified by the starting square 'a5' and the direction of movement 'RGHT'.

The square named 'c5' is located on the line identified by the starting square 'a7' and the direction of movement 'RGHT-DN'.

The square named 'c5' is located on the line identified by the starting square 'c1' and the direction of movement 'UP'.

The square named 'f2' is located on the line identified by the starting square 'a2' and the direction of movement 'RGHT'.

The square named 'f2' is located on the line identified by the starting square 'a7' and the direction of movement 'RGHT-DN'.

The square named 'f2' is located on the line identified by the starting square 'e1' and the direction of movement 'RGHT-UP'.

The square named 'f2' is located on the line identified by the starting square 'f1' and the direction of movement 'UP'.

Suppose two queens are placed on the board, on squares ‘c5’ and ‘f2’:

select anchor, mov_dir
from square_line
where square = 'c5'

intersect

select anchor, mov_dir
from square_line
where square = 'f2'
;

anchor  mov_dir
---------------------------
  a7    RGHT-DN

This fact may be verbalised as:

The queens are located on the line identified by the starting square 'a7' and the direction of movement 'RGHT-DN'.

Now change ‘f2’ to ‘g2’ and run the query.
Next time: the algorithm and the solution to the puzzle.

Eight Queens in SQL [2]

This is the second article in the series; the previous one introduced the problem and specified domains.

Links to all previous and this post’s code.
Part Main Topic Code (pgSQL)
1 Domains, Types part 1
2 Predicates, Constraints, Relations part 2

The Essence

The idea is to describe the problem using natural language and concepts of domains, predicates, constraints, relations and keys; here are a few simple rules to keep in mind.

From predicates and constraints to relations and keys
  • A predicate variable — from a specific domain — maps to an attribute of the relation.
  • Internal (to predicate) uniqueness constraints map to keys of the relation.
  • External (to predicate) inclusion constraints map to foreign keys between two relations.
  • A predicate and the matching relation represent — evaluate to — a set of facts about the universe of discourse (the problem).
  • A predicate and the matching relation should not be separated, otherwise information will be lost, for all practical purposes.
For SQL implementation
  • Relations map to tables.
  • Derived relations map to views.
  • Attributes map to columns.
  • Domains map to data types.
  • A key maps to a primary key or an alternate key (unique constraint).
  • A foreign key maps to a foreign key.
  • Predicates and constraints are added as comments on matching database objects (tables, keys).

Predicates, Constraints, Relations

[p_01] The column labelled (CL) with the assigned column number (CL_NO) exists.

(c1.1) Each column is assigned exactly one column number.
(c1.2) Each column number is assigned to exactly one column.

a_col { CL     col_lbl      -- p_01
      , CL_NO  ordinal_num
      }

  KEY {CL}                  -- c1.1

  KEY {CL_NO}               -- c1.2

 
[p_02] The row labelled (RW) with the assigned row number (RW_NO) exists.

(c2.1) Each row is assigned exactly one row number.
(c2.2) Each row number is assigned to exactly one row.

a_row { RW     row_lbl      -- p_02
      , RW_NO  ordinal_num
      }

  KEY {RW}                  -- c2.1

  KEY {RW_NO}               -- c2.2

 
[p_03] The board square named (SQUARE) is located in the row labelled (RW) and the column labelled (CL).

(c3.1) Each square is located in exactly one row; for each row it is possible that more than one square is located in that row.
(c3.2) Each square is located in exactly one column; for each column it is possible that more than one square is located in that column.
(c3.3) For each row and column, exactly one square is located in both that row and that column.
(c3.4) If a square is located in a row then that row must exist.
(c3.5) If a square is located in a column then that column must exist.

board { SQUARE  square_name   -- p_03
      , CL      col_lbl
      , RW      row_lbl
      }

  KEY {SQUARE}          -- c3.1, c3.2

  KEY {CL, RW}          -- c3.3

FOREIGN KEY {RW}        -- c3.4
 REFERENCES a_row {RW}

FOREIGN KEY {CL}        -- c3.5
 REFERENCES a_col {CL}

 
[p_04] The line identified by the starting square (ANCHOR) and the direction of movement (MOV_DIR) exists.

(c4.1) Each line is identified by exactly one starting square and direction of movement combination.
(c4.2) If a starting square is part of a line identifier, then that square must exist.

line { ANCHOR   square_name  -- p_04
     , MOV_DIR  mov_dir
     }

 KEY {ANCHOR, MOV_DIR}       -- c4.1

FOREIGN KEY {ANCHOR}         -- c4.2
 REFERENCES board {SQUARE}

 
[p_05] The square named (SQUARE) is located on the line identified by the starting square (ANCHOR) and the direction of movement (MOV_DIR).

(c5.1) For each square, that square may be located on more than one line.
(c5.2) For each line, that line may contain more than one square.
(c5.3) For each square and line, that square and line combination occurs at most once.
(c5.4) If a square is located on a line then that square must exist.
(c5.5) If a square is located on a line then that line must exist.

square_line { SQUARE   square_name    -- p_05
            , ANCHOR   square_name
            , MOV_DIR  mov_dir
            }

KEY {SQUARE, ANCHOR, MOV_DIR} -- c5.3, c5.2, c5.1

FOREIGN KEY       {SQUARE}            -- c5.4
 REFERENCES board {SQUARE}

FOREIGN KEY      {ANCHOR, MOV_DIR}    -- c5.5
 REFERENCES line {ANCHOR, MOV_DIR}

 

For convenience I will add a relation that allows me to access a square by name, row and column labels, or row and column numbers. This can be derived from the existing relations: board, a_col, a_row.

[p_06] The board square named (SQUARE) is located in the row labelled (RW), row number (RW_NO); and the column labelled (CL), column number (CL_NO).

-- derived relation
--
board1 { SQUARE  square_name   -- p_06
       , CL      col_lbl
       , RW      row_lbl
       , CL_NO   ordinal_num
       , RW_NO   ordinal_num
       }

  KEY {SQUARE}
  KEY {CL, RW}
  KEY {CL_NO, RW_NO}


-- derive using 
--

board1 =    {SQUARE, CL, RW} 
       JOIN {CL, CL_NO} 
       JOIN {RW, RW_NO}

board1 = JOIN {board, a_col, a_row}

 

There is one more predicate and relation to consider: the actual solution. At this point we do know that eight queens have to be placed on eight squares, and that there may be more than one solution. There will be some kind of calculation (algorithm) involved so the final relation will be derived from others.

[p_07] Placing queens on squares (S1), (S2), (S3), (S4), (S5), (S6), (S7), and (S8) is a solution to the puzzle.

(c7.1) A solution is identified by combination of squares S1 … S8.

-- derived relation
--
solution { S1 square_name             -- p_07
         , S2 square_name
         , S3 square_name
         , S4 square_name
         , S5 square_name
         , S6 square_name
         , S7 square_name
         , S8 square_name
         }

KEY {S1, S2, S3, S4, S5, S6, S7, S8}  -- c7.1

SQL Implementation

Creating an entity-relationship model is purely optional, but it does help.

Again, the translation from the reasoning to the SQL code is straightforward, here is the part 2 of the code.

What we have now are essentially empty relational variables. The relational variables a_col, a_row, board, line, and square_line need actual relational values to represent sets of facts about the chessboard.

Next time: relation values, propositions, and facts.

Eight Queens in SQL [1]

The puzzle is to place eight queens on a chessboard so that no two queens attack each other. The problem is old and has been well studied, the Rosetta Code has solutions for more than 80 programming languages, SQL included.

However, the Rosetta’s SQL solution is implementing a recursive algorithm in SQL, as opposed to a relational approach. In this article I will use a relational design approach to the problem, and then translate that to SQL. The main idea is to design using natural language and concepts of domains, predicates, constraints, sets, set operations, and relations.

The logical design process is database agnostic; for the initial code samples I will use PostgreSQL, which can later be translated to other SQL dialects.

Domains

A domain is simply a set of all possible values over which a variable of interest may range.

The board is composed of 64 named squares, organised in eight rows and eight columns; rows and columns are labelled.

square_name = {'a1' .. 'h8'} -- set of square names
row_lbl     = {'1'  .. '8'}  -- set of row labels
col_lbl     = {'a'  .. 'h'}  -- set of column labels

chessboard

It is convenient to refer to a square by an offset from another one (say +/-1), so I will assign ordinal numbers to rows and columns.

ordinal_num = {1 .. 8} -- set of ordinal numbers
                       -- for rows and columns.

Two queens attack each other if they are placed on the same row, column or diagonal. A queen moves in straight lines; so I will use a generic term line to designate a row, a column or a diagonal.

A line (row, column, or diagonal) is a set of squares. To solve the puzzle two queens can not be placed on a square belonging to the same line.

In order to identify a line I will use a starting square (anchor) and direction of motion. For example column b can be identified with an anchor b1 and direction UP; diagonal a2:g8 would have anchor a2 and direction right-up. So, this introduces set of anchors and set of movement directions.

mov_dir = {'UP', 'RGHT', 'RGHT-UP', 'RGHT-DN'}

An anchor is a square identified by its name, hence a set of anchor names is a proper subset of the set of square names; more about anchors a bit later.

SQL Implementation

The reasoning about domains can be nicely translated to PostgreSQL using types; here is the part 1 of the code. The translation from the reasoning to the code is straightforward, which was the idea in the first place.

Next time: predicates, constraints, relations.