Contradiction in Database [2]

The previous post focused on a contradiction in general; in this one I present a specific example using PostgreSQL.

Example

Starting with a predicate:
[p_1] Country named (CNAME) is assigned country code (CC2) and number (CID).

Country {CID, CC2, CNAME}
    KEY {CID}

Country (CID, CC2,  CNAME   )
      { (1,  'CA', 'Canada' )
      , (2,  'FR', 'France' )
      , (3,  'DE', 'Germany')
      , (4,  'DE', 'Denmark') -- typo
      }

Take a look at the sketch and the DDL. This is a common find: the ID is used as a primary key and there are no other constraints.

Consider the tuple with CID = 4; it has a wrong country code for Denmark, which results in a contradiction. From tuples with CID = 3 and CID = 4 it follows:

  • Country code DE is assigned to country named Germany.
  • Country code DE is assigned to country named Denmark.

In this case the contradiction is a result of the typo and Continue reading “Contradiction in Database [2]”

Contradiction in Database [1]

Suppose you have a database, of any type: SQL, NoSQL, not-yet-SQL, a spreadsheet, a bunch of CSV files, or a shoebox full of post-it notes.

I would argue that the database represents a bag (multiset) of facts; even if it does not look like that, it can be verbalized as such.

Let’s name this bag of facts: Γ.

Consider a case when two of these facts contradict each other.
Suppose that in Γ we find a fact: today is Monday; then another fact that states: today is Friday. Let’s see if I can formalize this:

A = Today is Monday.
B = Today is Friday.


        B ⊢ ¬A   Γ ⊢ B
        ----------------- (e1)
Γ ⊢ A   Γ ⊢ ¬A
------------------------- (i1)
Γ ⊢ (A ∧ ¬A)

Continue reading “Contradiction in Database [1]”

Types and Typos in SQL [2]

In the previous post I demonstrated how small typos in SQL become large logical errors, and how a “proper” type system could help in preventing these. This time I explore how PostgreSQL’s enum data type can be leveraged to gain some data safety and prevent logical errors.

PostgreSQL’s ENUM

In PostgreSQL the enum data type offers some type safety, for example:

CREATE TYPE d1 AS ENUM
('1', '2', '3');

CREATE TYPE d2 AS ENUM
('1', '2', '3');

Enumerated values are labels, hence the quotation marks. Continue reading “Types and Typos in SQL [2]”

Types and Typos in SQL [1]

Almost anyone who has spent time working with SQL made — or had to fix — this kind of bug:

-- two tables (sketch)
--
users {id, user_name}

trans {id, amt, cur, tx_time, user_id}
-- query with a typo
--
SELECT user_name
     , tx_time
     , amt
     , cur
 FROM users as u
 JOIN trans as t ON u.id = t.id -- <- typo !!
WHERE user_name = 'jack' ;

Try the query, here is the DDL with some data; note the wrong result. Continue reading "Types and Typos in SQL [1]"