Reverse Engineer a DB [7] – Toad Data Modeler, part 1

Toad Data Modeler is quite popular design tool from Quest Software. The installation went smooth, and I finished this whole exercise within 15 minutes from downloading Toad — impressive.

As in all previous examples of the series, I am using the database described in the first post.

 

Step 1

Start the program and click on File –> Reverse Engineering.

 

Step 2

The Reverse Engineering Wizard opens, note wizard steps on the left pane. The Alias is a name given to a collection of all setup parameters for a specific database; including connection, which objects to reverse, options to use, etc. At this point we have no previously defined aliases, so simply click Next to continue and create a new one.


 

Step 3

For Data Source, select MS SQL 2005.  As you can see from the list, all mayor databases are covered. Note the DDL Script RE branch, Toad allows reverse engineering of a database from a DDL script file too.


 

Step 4

For the data provider I had only one option available – ADO.


 

Step 5

The connection details; Toad actually displays the connection string here. To create a new one, click on Data Link Properties.


 

Step 6

We can type a connection string here, but it is simpler to click on Build and let Toad take care of the syntax.


 

Step 7

For the provider, choose SQL Native Client. On the Connection tab, enter the name of the server or an IP address; fill-in the username and the password and select the database name. If the list is empty, try clicking on Test Connection first.


 

Step 8

Now we have the connection string, click OK to continue.


 

Step 9

Back to the main wizard form, simply click Next to continue.


 

Step 10

Data Migrator selection — a new concept, did not see this one in previous examples. However, the wording is self explanatory, so simply select Reverse Engineering from Database MS SQL 2005.


 

Step 11

A list of objects to reverse engineer, followed by a list of options. Select all objects and accept default setting from the list of options.


 

Step 12

Time to save all this settings as an alias, which can be quickly selected next time. Once saved, select the alias and click Next.


 

Step 13

First select the Schema and than tables to reverse engineer; click on Execute to start the process. The bottom part of the form shows log messages.


 

Step 14

Diagram on the left was automatically generated, obviously it needed some rearranging. After some tinkering with shapes, I got the one on the right.


 

This is enough steps for one post. In the next part we’ll look into the procedures editor, converting the relational to a logical model and reporting. To be continued.

 

Reverse Engineer a DB [6] – Oracle SQL Developer Data Modeling

Oracle SQL Developer Data Modeling is still in the early adopter release, so it is free to download. However, it will turn into for-cost option of the Oracle SQL Developer. It is developed in Java, so you will need the latest Java SDK. My version is 1.5.1 build 525; it is obvious that some things will change/improve in future versions, but the product is stable — I did not have a single crash. As in all previous posts of the series, I will use the same database, described in the first post.

The biggest problem I encountered was a driver for MS SQL Server; took me a while to get it working. To save you some time, here are the steps to take:

  1. Download the jtds driver;
  2. Install it as described in Help –> Data Modeling Options –> Third Party JDBC Drivers;
  3. Reboot.

 

Step 1

Once you have the driver installed; after reboot, start the program and click on File –> Import –> Data Dictionary.

 

Step 2

The import wizard opens. Note the four wizard steps: connect to database, select schema, select objects to import and generate design. The right-side pane contains already defined database connections, in this case it is empty. To add a new connection, click on the Add button; the New Connection Editor opens.

 

Step 3

First step is to select a database type; form the Type list select MS SQL Server 2005. At this version we have an option of Oracle, MS SQL Server, IBM DB2/UDB and JDBC/ODBC bridge for generic ODBC access. Keep in mind that the program uses Java drivers natively.


 

Step 4

Fill in all other connection parameters.


 

Step 5

The database is now present in the connection list. Select the database from the list and click on Test Connection, it should work.


 

Step 6

The connection for this example is actually to the MS SQL Server, which contains several databases. Select the database to reverse-engineer; all schemata below the database name are automatically selected.


 

Step 7

Which objects to import? Oracle allows for fine grain selection of what to reverse-engineer, objects are grouped into categories, each category on a tab. On the Tables tab I have selected all except the sysdiagrams table, and on the Procedures Tab all stored procedures starting with "p_"; as in most of previous examples.


 

Step 8

Review the summary and click Finish to start the reverse-engineering process.


 

Step 9

The diagram one on the left was automatically generated, I have rearranged shapes to get the one on the right. When moving shapes around, make sure to disable the Line Auto Route option, found under Tools –> General Options –> Diagram. The graphical part is still a bit rough and slow, but will probably be improved by the time the program reaches commercial release.


 

Step 10

To move to the Logical level (model) select Design – > Engineer to Logical Model.


 

Step 11

Again Oracle offers a detailed list of options — what to engineer. The logical model is on the right. For this level Oracle offers Barker and Bachman notations, the example is in Barker (crow feet) notation. It is easy to switch, right-click on the diagram and select a different notation;

 

Step 12

The navigation pane shows object for both, logical and relational levels. Explore the objects to see mapping; table – entity, column – attribute, foreign keys – relations.


 

Step 13

Still have to find my stored procedures somewhere; they are located in the physical model. To view the physical model — with database objects — select Physical –> Open Physical Model from the menu, and then select MS SQL Server from the list.


 

Step 14

Expand the Physical Models branch in the navigator and double-click on a procedure name to open the Stored Procedures Editor.


 

Reverse Engineer a DB [5] - ERwin

ERwin Data Modeler is a full-blown data modeling tool, allows for modeling on logical and physical levels, forward and reverse engineering of databases and data warehouses.

Step 1

Start ERwin and click on Tools –> Reverse Engineer.


 

Step 2

Select Logical/Physical for the New Model Type, and SQL Server for the Target Database.


 

Step 3

Under the Owners option, type in dbo; leave all other options as per default. Find some time to return to this dialog and explore all the options for reverse-engineering.


 

Step 4

The SQL Server Connection dialog opens, type in (or select) all the connection parameters and click Connect. The process starts as soon as the connection is established, you will see a progress bar and staus display during the reverse engineering process.


 

Step 5

Here is the first diagram, pretty good. Time to rearrange shapes and change few display options.


 

Step 6

The diagram on the left is on the Physical level and the one on the right is on the Logical level. You may have noticed that ERwin did not offer a list of tables, but has included all the tables owned by dbo, including the sysdiagrams table — in previous examples I have excluded the table. This is actually a correct behaviour, because the sysdiagrams is a part of the database and ERwin is a bi-directional tool, so it can be also used for database maintenance and refactoring.



Step 7

Expand Stored Procedures branch on the Model Explorer; make sure you are on the Physical level. Right-click on a procedure as select Properties.


 

Step 8

The SQL Stored Procedure Editor opens. All procedures staring with "p_" are mine, while those starting with "sp_" are part of the SQL Server diagram package, together with the sysdiagrams table. It is interesting to compare the Code and the Expanded tabs; with reverse-engineering, stored procedures are moved to model level — hence all the placeholders on the Code tab.


By poking around the Model Explorer, we can see all the reverse-engineered objects, including:

  • Tables
  • Relationships (foreign & primary keys)
  • Triggers
  • Stored procedures
  • Functions
  • Views
  • Logins
  • User IDs.

 

Reverse Engineer a DB [4] - Visio

The professional edition of MS Visio has a database reverse-engineering feature; see the edition comparison chart. If new to the series, you may want to scroll down to the first two posts of the series, where I describe the sample database and the ODBC connection setup. What follows is a step-by-step reverse engineering procedure.

Step 1

Start a new file by selecting File –> NEW –> Software and Database –> Database Model Diagram.


 

Step 2

Once a new file opens, give it a name and save the file. After that, select Database –> Reverse Engineer from the top menu bar.


 

Step 3

The Reverse Engineering Wizard opens. First we have to select a driver to connect to the database — select Microsoft SQL Server from the list.


 

Step 4

When the driver is selected, the Data Source list is populated with entries from the ODBC  list which use the driver. I have added the database to the ODBC list in a previous post — select CodeDB from the list.


 

Step 5

Enter the password and click OK to continue.


 

Step 6

Select which objects to reverse-engineer. Visio offers tables, views, indexes, constraints and code (stored procedures and functions). Select all and click Next.


 

Step 7

A list of tables to include; as in the previous example (post) I have skipped the sysdiagrams table.


 

Step 8

A list of stored procedures to include. There is a small trick here: procedures starting with ’sp_’ are internal to SQL Server. It is a bad practice to name your procedures using the ’sp_’ prefix, so mine all start with  ‘p_’.


 

Step 9

Select Yes to add object shapes to the current page.


 

Step 10

Review the summary and click Finish.


 

Step 11

The initially generated diagram (left) looks quite good. By rearranging shapes and tinkering with the display options, I have created the one on the right. When moving shapes around, make sure not to accidentally disconnect a connector line from a shape: the connector represents a relationship object between tables, and disconnecting a line may remove a foreign key from a table.


 

Step 12

To inspect table columns, keys, triggers etc., click on a table in the diagram or in the table list.


 

Step 13

To examine stored procedures, switch to the code tab and double-click on a procedure name; the code editor opens. Among other features, the editor allows for specification of a mirror file in which to store the code outside Visio.


 

Reverse Engineer a DB [3] - Visual UML

In the last two posts I have described an existing database and created an ODBC connection to it, now it’s time to actually do some reverse-engineering. For me, the primary purpose of reverse-engineering is to document a database — the first step in any kind of refactoring or upgrade.

This should not be too complicated, the database was originally designed using Visual UML. An object modeling tool, Visual UML is not just for databases, it generates or reverse-engineers Java, VB/VBA, C++, etc. It is great for getting handle on projects where developers did not have much time for documentation. I have an older version (2.9.2) which still works fine, a newer (5.3) trial version is available for download here.

Step 1

To start with a new model, select File —> New Model, enter the model name and click OK.

Step 2

Now it is time to select a target language for the model, we need the SQL Server.

Step 3

From the Tools menu, select Reverse Engineer –> Database

Step 4

From ODBC DSN select the CodeDB from the dropdown list. The list is populated from ODBC data sources and we have added the CodeDB database as an ODBC source in the previous post.

Step 5

Fill-in the User ID, Password and the Database name – make sure they match what was entered during ODBC source definition.

Step 6

A list of tables from the database appears. I have selected all tables except the sysdiagram table, this one belongs to the system.

Step 7

Select yes to create a diagram.

 

Leave everything default and click OK. A new diagram is created and is quite messy, so I will have to clean it up later.

 

Step 8

Let us look at the list of objects created.

  • Tables are created as classes;
  • Database is an equivalent of a package;
  • Associations are created out of foreign key constraints;
  • There is one Class Diagram object that was created during the last step.

Time to clean-up the diagram. After some re-arranging and selecting what to display and what to hide, here is the result – not bad. You can compare it with the original from which the database was created. There are some minor differences:

  • In the original model, the primary key constraint was hidden (display), so it does not show explicitly in table attributes.
  • I did not explicitly create indexes for primary keys  in design stage, they were automatically created when the database was created. I could have decided not to process indexes during reverse-engineering, or to simply hide them from the diagram.
  • Arrows are reversed. On the reverse-engineered model they follow direction of one-to-many relationships, by default the "many" side has an arrow. On my original model, I have set up arrows to point into primary key, the ‘1′ side of the relationship.  This is just my preference, similar to the key side of an SQL server diagram; arrow directions can be reversed by editing relationship properties.

My version of Visual UML did not offer to reverse-engineer stored procedures, however according to the website the latest version does support stored procedures and triggers.

When it comes to documenting databases, I find that understanding tables and relationships is the most important part, stored procedures and triggers can be uploaded using Management Studio or similar tool.


To be continued.

Reverse Engineer a DB [2] – ODBC setup

Database tools and applications frequently use ODBC for connection to databases. In the previous post I have described a database that we will play with. The database named CodeDB is located on a server named DAMIR5, running MS SQL Server 2005. This article provides a step-by-step procedure for creating an ODBC connection to the database.

Note: this setup is done on a "remote PC" (not the server) — the same PC that will be used to test all previously mentioned database tools.

Step 1

Under Administrative Tools select Data Sources (ODBC). The Data Source Administrator opens.
You may also use  Control Panel –-> Administrative Tools –> Data Sources (ODBC).

Step 2

The System DSN tab lists data sources available to all users on the machine, including services. This is where you would usually want to add a new source. If you want to create a data source for a current user only, use the User DSN tab. Click the Add button to create a new data source.

Step 3

Create New Data Source dialog opens. This dialog has a list of drivers for different databases, for MS SQL server 2005 use SQL Native Client. Select the driver and click Finish.

Step 4

A new dialog, for configuring the SQL Native Client opens.

  • Enter a name for the data source — I used CodeDB
  • Something for description
  • Server name or an IP address, for local database use "localhost"

Step 5

On the next dialog we define the user name and the password for the database, enter whatever is appropriate for your database.

Step 6

Select the default database and check the ANSI options.

Step 7

Almost there, leave everything as per default and click Finish.

Step 8

The summary, click on the Test Data Source button.

Step 9

And voilà, click OK to close the dialog.

Step 10

We are back at the beginning, this time the new CodeDB data source is listed under System Data Sources. Click OK to finish and exit the dialog. From this point on, this database connection is available to any application which can use an ODBC data source.

Reverse Engineer a DB [1]

In the next few posts I will tinker with database modelling and reverse-engineering tools. I plan to use: Visio, Visual UML, Erwin, Toad, Oracle Designer, DB Explorer, Squirrel, and maybe some more. A Google search for "database modeling tools" returns quite a list. But, one step at a time; first thing we need is a database.

A few years ago, I put together a simple code repository for one of my Excel VBA projects. The idea was to store separate chunks of code in a database and dynamically assemble the project code as required. The model I chose follows a typical book layout: sentence, paragraph, chapter, book. Originally I designed it in Visual UML — see the diagram — and implemented it in MS SQL Server.

Entity (Table) Book Equivalent Excel VBA Object
Snippet Sentence Chunk of code, one line minimum
Block Paragraph Procedure, Function, Comment block
Module Chapter Code module, Class, Form
Project Book VBA Project

 

Snippet is a chunk of code, minimum one line, up to a whole procedure. Block has many snippets, one snippet can belong to many blocks. For example, if a block represents a subroutine, a snippet may be ‘End Sub’. Block_mm_Snippet is a many-to-many relationship table and also holds the position of a snippet within a block. The Module table represents a code module, a class or a form in VBA. One module can have many blocks, one block can belong to many modules — keep in mind that not all modules have to belong to the same project. A project has many modules, a module can belong to several projects.

Although it worked fine for me, the model is far from ideal. It is obviously missing a few constraints. Snippet TXT field should be unique, so should relationship keys in Module_mm_Block and Project_mm_Module tables to prevent repeating a procedure within a module, or repeating a module within a project. I could also add version fields to Block, Module, and Project tables. It will be interesting to see how changes can be made in the physical or logical layer in a diagram and deployed back to the existing database.

To be continued.

 

Bulk change post categories in WordPress

The following code moves all WordPress posts from one category to another. Both categories must already exists in the database. The from_category is not deleted form the database, only posts are moved. Needless to say, do backup your database before trying this.
Make sure to specify category slugs (not names) in the first two lines of the code. Comments tell the story; you may also want to look at the database model.

-- Move all post from one category to another;
-- use category SLUG, not name.
-- Define categories here.
SET @from_cat_slug = _utf8'from_category_slug_here';
SET @to_cat_slug = _utf8'to_category_slug_here';
 
-- Fetch term_taxonomy_id for both categories.
SELECT @from_cat_tax_id := tax.`term_taxonomy_id`
    FROM `wp_term_taxonomy` AS tax JOIN
         `wp_terms` AS trm ON tax.`term_id` = trm.`term_id`
    WHERE trm.`slug` = @from_cat_slug
        AND tax.taxonomy = 'category';
 
SELECT @to_cat_tax_id := tax.`term_taxonomy_id`
    FROM `wp_term_taxonomy` AS tax JOIN
         `wp_terms` AS trm ON tax.`term_id` = trm.`term_id`
    WHERE trm.`slug` = @to_cat_slug
        AND tax.taxonomy = 'category'
 
-- Create table to hold post ids.
CREATE TEMPORARY TABLE `ds_cat_mrg` (`id` BIGINT);
 
-- List all posts linked to the from_category.
INSERT INTO ds_cat_mrg (`id`)
    SELECT `object_id`
    FROM `wp_term_relationships`
    WHERE `term_taxonomy_id` = @from_cat_tax_id;
 
-- Delete links between these posts and both categories.
DELETE FROM `wp_term_relationships`
    WHERE `object_id` IN (SELECT `id` FROM `ds_cat_mrg`)
    AND `term_taxonomy_id`
               IN (@to_cat_tax_id, @from_cat_tax_id);
 
-- Insert links to the new category.
INSERT INTO `wp_term_relationships`
        (`object_id`, `term_taxonomy_id`)
    SELECT `id`, @to_cat_tax_id FROM `ds_cat_mrg`;

 

Coral8 – Randomly Sampling a Stream

The example below randomly samples an input stream and inserts samples into a named window for later processing. A variable named Dice is "rolled" each time a new row enters the input stream. A row is copied into the window if Dice = 1.

-- Variable for random number
CREATE VARIABLE INTEGER Dice;
CREATE VARIABLE INTEGER DiceSides = 6;
 
-- Input stream
CREATE INPUT STREAM StreamIn
SCHEMA (
    Val INTEGER
);
 
-- Each time a row enters the input stream,
-- roll the Dice
ON StreamIn
SET Dice =
    TO_INTEGER(RANDOM() * TO_FLOAT(DiceSides)) + 1;
 
-- Some data entering the input stream
ATTACH INPUT ADAPTER RMG
    TYPE RandomTuplesGeneratorAdapterType
TO STREAM StreamIn
PROPERTIES
    RATE     = "20"
    ,ROWCOUNT = "600"
;
 
-- Named window for samples
CREATE WINDOW WR
SCHEMA(
    Val INTEGER
    ,Dice INTEGER
)
KEEP 150 ROWS
;
 
-- Randomly sample the input stream
INSERT
    WHEN Dice = 1 THEN WR
SELECT Val, Dice
    FROM StreamIn
;


Here is a result, out of 600 messages that entered the input stream, 96 ended in the window; I was using a six-sided die — not bad.

 

Bloated WordPress Database

It all started when Google crawler reported a few long-to-load pages on the site. After some poking around, I finally looked at the database. The wp_options table was three times bigger than the wp_posts table — oops. Seems that the wp_options is also used for the RSS cache — so it eventually bloats.

As a quick fix, you may simply run this query to get rid of the RSS cache entries.

DELETE
  FROM `wp_options`
  WHERE `option_name` LIKE 'rs_%'
  AND`autoload` = 'no'
  AND CHAR_LENGTH(`option_name`) > 25;

After that optimize the table using:

OPTIMIZE TABLE `wp_options`;

In my case, the wp_options table went from 950 kb down to 47 kb; almost 20 times smaller. For finer control of what to delete — and if you do not like SQL — try the WP-Options-Manager plug-in. The manager also allows cleanup of junk data, left behind by inactive plug-ins.

Somewhere around version 2.6, WordPress introduced auto-save and revisions — you may or may not like these. The following query lists all revisions in the wp_posts table.

SELECT *
  FROM `wp_posts`
  WHERE `post_type` = 'revision';

The auto-save and revisions can be disabled using this plug-in. And a final warning — before you start SQL scripts — make sure you have the latest database backup somewhere safe; bad things do happen.

Update:

The database backup dropped from 209 kb to 86 kb — by about 60%.