Applied design patterns: Database abstraction (part 1)

June 9, 2008 – 8:21 pm

In some of my previous posts, I’ve pretty much ranted about the poor design qualities of some people that enjoy calling themselves PHP programmers. In these two posts however, I’d like to show you what good and proper design is. The topic is the same as that in my previous posts: databases. Or, to be more precise, database abstraction.

In the first part, we’ll transform a concrete set of database interaction classes into an abstract set, to improve flexibility and extensibility. In the second part, we’ll introduce a design pattern which will help us to manage and maintain the new abstract database classes and their concrete implementations. We’ll do part one in this post.

With database abstraction, I mean in this case that you write your database-driven program in such a way, that you’ll never really have to know what database you’re using. Consider the following example, which might not be the most neat example, but it’ll do for now:

function do_something()
{
// establishes a database connection
$databaseConnection = new MySQLDatabaseConnection($server, $username, $password);
$databaseConnection->SetDatabase('someDatabase');
// assemble the query (we'll do it directly now btw, ignore this example)
$query = new SelectQuery("SELECT * FROM 'someTable'");

$result = $query->Execute($databaseConnection);

// do something with the result, you get the idea.
}

The above example is actually pretty neat, if I say so myself. Feel free to replace it with either calls to the MySQLI class or direct mysql_ function calls, it’s not really important in this case.

Okay, you’ve made a web application, and the app is littered with calls to get data from your MySQL database, awesome. Now, however, you read somewhere that PostgreSQL is a lot faster / better in some cases, and you want to go and convert your entire application to use PostgreSQL instead of MySQL. In a 20-file or less application, this is still doable - open the .php file, find all calls to execute database queries, and replace them with a Postgres equivalent - which, outside of the actual syntax of the query (which I’m not familiar with btw), is pretty much the same (instead of mysql_query() you call pg_query(), for example).

But then you want to revert back after a few months. Great, re-do it all, again 20 times something of edits. This is getting tedious.

And it gets worse. What if instead of 20, you have 200 files (which isn’t uncommon in large websites)? What if you’re doing a research project to which database is more suitable for your application? You can either prepare for a lot of searching and replacing of similar-looking code, or, if you’re smarter, delegate that task to some moron who’s thinking he’s doing really important work.

If you’re teh smartest though, you’ll want to make your program in such a way that it won’t matter which database software you’re using. Lucky for you, there’s people who invented programming language features that allows you to do just that.

Let’s analyze the problem first. You have a database that contains data. Your program wants to create, read, update and delete (CRUD) data from that. Databases support all those four operations, without exception - that’s a constant, that’s a common feature, that’s important. The first step into abstracting a program or program structure is to discover what is common in the program. For databases, the common factors are:

1. You can connect / disconnect to them.
2. You can execute queries on them (CRUD queries).

That’s two common features all databases share. There’s a load more you can do with databases, but that’s the basics, that’s the very least we want to be able to do.

So, what’s this abstraction business again? We’ll want to make the calls to connecting to and executing queries on a database in such a way, that when we call it, we won’t know which database it’s actually performed on.

To do this, we’ll have to make a few things:

1. An abstract interface that contains the tasks you can perform on / with a database.
2. An implementation for each database type
3. A means to get a hold of the object(s) required to execute operations.

The first thing we need is an abstract interface. Think of an abstract interface as a list of requirements, a recipe, a guideline. In this interface, we define what we expect objects of this kind can do. We’ll first define an interface to a DatabaseConnection object. We want a DatabaseConnection to be established, closed, and for it to be able to execute queries. (In this example, we’ll pass the responsibilities to executing queries to the object that represents an active connection to a database. This might not be the most ideal or logical implementation, but eh, I can’t be arsed to think up something that makes more sense).

You’ll notice that there’s three function definitions in there, but without an implementation - the functions are there, but they don’t contain anything executable. In fact, calling them like this will result in fatal errors. But that’s not what this interface does - it provides a guideline, not an implementation. So, next we’ll create an implementation - we’ll do one for MySQL in this one.

class MySQLDatabaseConnection implements DatabaseConnection
{
	private $connection = null;

	public function __construct($host, $username, $password)
	{
		$connection = mysql_connect($host, $username, $password);
	}

	public function Execute($query)
	{
		return mysql_query($query, $connection);
	}

	public function Close()
	{
		if ($connection != null)
		{
			mysql_close($connection);
		}
	}
}

(it’s pretty basic btw, we’ll just assume we’re getting direct queries in this one. It also doesn’t really make sense since it looks pretty basic and not very advanced or complex, but eh, it’s just an example).

With the rules set by the interface, we can also implement a MySQLIDatabaseConnection that uses PHP’s MySQLI_ (MySQL Improved) functions, or a PostgreSQLDatabaseConnection that uses the Postgres functions. The point is, we want to make the actual connection establishing and query executing code invisible from the caller - we want to encapsulate those processes in defined classes.

So, in our own code, we can now just write a statement like the following:

$result = $query->Execute($databaseConnection);

And who knows what type $query is right now. MySQL? MySQLI? PostgreSQL? Ovrimos? The last one is a random one I took from the list at PHP.net, I have no clue what it is or how it works. But that’s the cool thing: it doesn’t really matter now. All you do is execute $query on a $databaseConnection, why should you care what it gets executed on? As long as the Execute method executes the query and returns a result, you don’t care what goes on behind it - or, better said, you shouldn’t care. By encapsulating the problem, you reduce or remove the need to know about the problem.

What’s next? There’s actually only one database-specific line in the example code at the top of this page:

$databaseConnection = new MySQLDatabaseConnection($server, $username, $password);

In order to solve that predicament, we’ll want to move the assignment of the $databaseConnection to a single, centralized location. In other words, we’ll want to move the creation of the concrete DatabaseConnections to a standard location. We’ll introduce the concept of a creational design pattern (the Factory Method) in part 2 of this tutorial.

Obligatory list of social network links:

These icons link to social bookmarking sites where readers can share and discover new web pages.
  • bodytext
  • del.icio.us
  • Google
  • E-mail this story to a friend!
  • Print this article!
  • Reddit
  • Slashdot
  • StumbleUpon
  1. 5 Responses to “Applied design patterns: Database abstraction (part 1)”

  2. Nice articles.

    I still try to figure out how your interface looks like.

    It has to be something like

    interface DatabaseConnection {
    public function Execute($query);
    public function Close();
    }

    But where does the connection fit?

    By zocq on Jul 1, 2008

  3. I’m not sure I understand your query entirely, but the class IS the connection. When you construct the DatabaseConnection class (the constructor doesn’t appear in the interface btw, left it out ’cause I’m not sure if a constructor for an object belongs in an interface), it will store a $link internally, i.e. a variable pointing to the result returned by the mysql_connect() method. The class itself will basically wrap around that result, the $link variable.

    By admin on Jul 1, 2008

  4. Thanks for your fast response. You’re absolutely right: the constructor doesn’t belong to the interface. And the class is indeed the connection (via its constructor). It was me mixing up things (feeling dumb).

    By zocq on Jul 1, 2008

  5. tests time mashine

    By zooppiemi on Jul 9, 2008

  1. 1 Trackback(s)

  2. Jun 9, 2008: Applied design patterns: Database abstraction (part 2) | For Great Justice

Post a Comment