Learn to apply well-understood agile development practices to the database, using the open-source Liquibase tool to refactor your database schema in a controlled, incremental fashion.
DATABASE REFACTORING
A simple change to a database schema that improves its design while retaining both its behavioral and informational semantics.
I haven’t written a lot of Java in the past year. Mostly these days I write Groovy.
I participate in the local development community by serving on the boards of www.denveropensource.org and www.iasadenver.org.
The August Technology Group is my consulting firm.
As software developers and architects, our worlds are filled with data.
Unfortunately, the data is not always in the condition we’d like. We try to do a good job structuring it, but the reality is that we often fail, and even when we succeed, the business changes enough so that even our successes are short-lived.
Our tool sets and processes have developed to deal expertly with rapidly changing code.
The same tools and practices have not been applied to data, leading to one of two problems...
Will it die? It is said of evolutionary systems that things that fail to adapt to change die out. If only this were the case with the enterprise database...
Rather than letting it go extinct, the production DBAs surround it like priests, carefully filling it with embalming fluid and wrapping it in linen strips. (And possibly putting some operations personnel in the tomb to serve the database in the afterlife.)
Inability to change and overprotection lead us to this dread antipattern.
There are long-lived Enterprise applications that rely on it, so it can’t just go away.
It isn’t really alive either, because business needs are constantly changing, but the database can’t change with them.
At least the production DBAs seem to think so!
We’ll figure out how to manage changes to the database such that we can make them with confidence in a way that brings the best of developer tools to bear and is relatively friendly to DBA workflows. We’ll do this with a mindset and a tool.
The fix is what Scott Ambler calls “evolutionary [or agile] database development.” This consists of five components.
Martin sez...
Scott sez... Code refactorings are really only concerned with behavior. DBs have behavior (stored procs, triggers, etc.) but also information. The database must say the same thing in the same way after the refactoring.
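To make "retaining informational semantics" concrete, here is a minimal sketch of a rename-column refactoring against an in-memory SQLite database. The `customer` table, its columns, and the sample rows are all hypothetical; the point is only that the same facts can be read back after the schema changes.

```python
import sqlite3


def rename_column_refactoring(conn):
    """Rename customer.fname to customer.first_name by rebuilding the table."""
    conn.execute(
        "CREATE TABLE customer_new ("
        "id INTEGER PRIMARY KEY, first_name TEXT NOT NULL)"
    )
    # Copy every row so that no information is lost in the move.
    conn.execute(
        "INSERT INTO customer_new (id, first_name) "
        "SELECT id, fname FROM customer"
    )
    conn.execute("DROP TABLE customer")
    conn.execute("ALTER TABLE customer_new RENAME TO customer")
    conn.commit()


# Hypothetical starting schema and data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, fname TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO customer (id, fname) VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)

before = conn.execute("SELECT id, fname FROM customer ORDER BY id").fetchall()
rename_column_refactoring(conn)
after = conn.execute("SELECT id, first_name FROM customer ORDER BY id").fetchall()

# The design changed; the information did not.
assert before == after
```

The final assertion is the informational-semantics check in miniature: a refactoring that dropped or altered rows would fail it.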
There are two mistakes here. One is thinking we’re smart enough to do all this designing correctly at the outset: we’re not. The other is thinking that the database’s business context is a static thing: it isn’t. There is no “right” up front, because requirements will change constantly.
Which doesn’t mean we can’t do a couple of days of designing at first; we can. It makes sense to try to anticipate what we can and make big, hard-to-change commitments correctly. We always expect change, though.
TDD adoption in software development is low enough, but it is virtually unheard of in database development. The tooling lags behind and the expertise is singularly rare.
There are ways to do it. The tools aren’t what they are for TDD of code, but there are options.
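One low-tech option, sketched here with nothing but Python’s standard `sqlite3` module: write the expectation first, then grow the schema until it holds. The `account` table and the helper names are hypothetical, and a real project would run checks like this against a sandbox copy of its actual database engine rather than SQLite.

```python
import sqlite3


def create_schema(conn):
    """The schema under test (hypothetical): one account per email address."""
    conn.execute(
        "CREATE TABLE account ("
        "id INTEGER PRIMARY KEY, "
        "email TEXT NOT NULL UNIQUE)"
    )


def duplicate_email_is_rejected():
    """Return True if the schema refuses a second account with the same email."""
    conn = sqlite3.connect(":memory:")  # fresh, private database for each check
    create_schema(conn)
    conn.execute("INSERT INTO account (email) VALUES ('a@example.com')")
    try:
        conn.execute("INSERT INTO account (email) VALUES ('a@example.com')")
    except sqlite3.IntegrityError:
        return True
    return False


assert duplicate_email_is_rejected()
```

Remove the `UNIQUE` constraint and the assertion fails, which is the kind of red-to-green loop TDD depends on.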
Bring all the knowledge, practices, and advantages of software source control to the database. Simply control all those text files in SVN or Git like you normally would. This practice is old hat.
Every database artifact goes into the repository.
Developers need a place to deploy refactorings when they’re trying to get their tests to pass. This must be a local database not used by any other team member or system.
If we learn how to refactor databases from Ambler and Sadalage, Liquibase is the tool that makes it easy. It is an XML-based love poem to Scott Ambler.
Liquibase is fundamentally a command-line tool written in Java. It uses JDBC to communicate with the database, and can coexist well in a non-Java shop. It can be invoked from popular open-source build tools and frameworks.
We’ll consider three aspects of Liquibase: how it stores the schema, how it interacts with the database, and how to use it in some real-world scenarios.
Must we rewrite our SQL in XML? We must. It’s painful and unappetizing, but worth it! Also, there’s a Grails plugin called Autobase that lets us do it in nice Groovy Builder syntax, which is preferable. We won’t address Autobase in detail here, but it’s worth looking into.
The changelog is a script that builds your schema one DDL statement at a time. Each changeSet is converted into a dialect-specific SQL statement and is executed in the database, then marked as complete in a log table.
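The mechanism is easier to see in a toy model. The sketch below is not Liquibase’s implementation: it skips the XML and the dialect translation entirely, uses SQLite, and invents a `change_log` table and an `update` function as stand-ins. It only illustrates the core loop of running each changeset once and recording it in a log table.

```python
import sqlite3

# Hypothetical changelog: (id, author, SQL) tuples standing in for changeSets.
CHANGELOG = [
    ("1", "alice", "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)"),
    ("2", "alice", "ALTER TABLE person ADD COLUMN email TEXT"),
]


def update(conn, changelog):
    """Run every changeset not yet in the log table; return the ids that ran."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS change_log ("
        "id TEXT, author TEXT, PRIMARY KEY (id, author))"
    )
    already_run = set(conn.execute("SELECT id, author FROM change_log"))
    applied = []
    for cs_id, author, sql in changelog:
        if (cs_id, author) in already_run:
            continue  # recorded as complete earlier, so never run it again
        conn.execute(sql)
        conn.execute(
            "INSERT INTO change_log (id, author) VALUES (?, ?)",
            (cs_id, author),
        )
        applied.append(cs_id)
    conn.commit()
    return applied


conn = sqlite3.connect(":memory:")
first = update(conn, CHANGELOG)   # empty database: both changesets run
second = update(conn, CHANGELOG)  # nothing new: nothing runs
third = update(
    conn,
    CHANGELOG + [("3", "bob", "CREATE INDEX idx_person_name ON person (name)")],
)                                 # only the newly appended changeset runs
```

Because the log table lives in the database itself, any copy of the database knows which changesets it has already seen, and the same changelog can be replayed safely against development, test, and production.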
Extracts the database’s metadata and generates a changelog. This is the first step in getting started on a database that already exists.
Since Liquibase tracks the status of all changesets in the changelog, we’ll need to tell it that our newly extracted changelog is in sync with the actual database. This command usually shouldn’t be executed except as the second step in getting started with the tool.
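The two getting-started steps can be sketched in the same toy style. Again, this is an illustration under assumptions, not how Liquibase does it: `generate_changelog`, `changelog_sync`, and the `change_log` table are hypothetical names, and SQLite’s `sqlite_master` catalog stands in for whatever metadata a real database exposes.

```python
import sqlite3


def generate_changelog(conn):
    """Build one changeset per existing table from the database's own metadata."""
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    return [
        (str(i), "generated", sql)
        for i, (_name, sql) in enumerate(rows, start=1)
    ]


def changelog_sync(conn, changelog):
    """Record every changeset as already run WITHOUT executing its SQL."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS change_log ("
        "id TEXT, author TEXT, PRIMARY KEY (id, author))"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO change_log (id, author) VALUES (?, ?)",
        [(cs_id, author) for cs_id, author, _sql in changelog],
    )
    conn.commit()


# A hypothetical pre-existing database that was never managed by a changelog.
legacy = sqlite3.connect(":memory:")
legacy.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
legacy.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

changelog = generate_changelog(legacy)  # step one: extract
changelog_sync(legacy, changelog)       # step two: mark as in sync
recorded = legacy.execute("SELECT COUNT(*) FROM change_log").fetchone()[0]
```

The sync step matters because the tables already exist: replaying the extracted `CREATE TABLE` statements against this database would fail, so they are recorded as done instead of being run.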
Plays any new changeSets in the changeLog against the database. This is the command you’ll use the most during evolutionary database development with Liquibase.
Marks the database for a future rollback.
Rolls back changesets to a given tag.
Supposed to document the differences between the database and the changelog. I’ve not had good luck with this so far against MySQL and SQL Server databases.
Evolutionary Database Development is not controversial among software developers, because we are largely persuaded of the core propositions. Moreover, Liquibase seems like a reasonable tool to help. Now let’s talk about some practices to put in place in a brownfield effort.
People are using it, it works, it has literally years of knowledge and business process encoded in it. It’s also ugly and outdated, and its schema seems to have been built by the DBA Club after school. But you’re replacing it, with Grails...!