When Was the Last Time You Switched RDBMSs? 22 July 2009

Posted by manniwood in Programming, SQL.

It’s been pointed out by many people, including this humble blogger, that writing portable SQL, or worse, using an ORM to make it easy to switch RDBMSs, is about as silly as using only a subset of the features of Python or Ruby in case you have to switch between them, or back to Java.

Thomas Kyte is particularly lucid (not to mention entertaining) on this point, especially in the section “Openness” in Chapter 1 of Expert Oracle Database Architecture. He talks about a team that had committed to some very specific front-end technologies, but were worried about what would happen if they switched databases, and wanted to write the most portable SQL possible.
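To make the trade-off concrete, here is a minimal present-day sketch of the same "count a page hit" operation written two ways: once as lowest-common-denominator SQL, and once using the engine's own upsert. SQLite stands in for the database only because it ships with Python; the table and function names (`page_hits`, `record_hit_portable`, `record_hit_native`) are made up for illustration, and the `ON CONFLICT` clause assumes SQLite 3.24 or later (PostgreSQL 9.5+ has a similar clause).

```python
import sqlite3

# Hypothetical schema and helpers, for illustration only.

def make_db():
    """Create an in-memory database with a single counter table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page_hits (url TEXT PRIMARY KEY, hits INTEGER NOT NULL)"
    )
    return conn

def record_hit_portable(conn, url):
    """Portable version: try an UPDATE, then INSERT if no row matched.

    Two statements, so correctness under concurrency depends on the
    surrounding transaction and the engine's locking behaviour.
    """
    cur = conn.execute(
        "UPDATE page_hits SET hits = hits + 1 WHERE url = ?", (url,)
    )
    if cur.rowcount == 0:
        conn.execute("INSERT INTO page_hits (url, hits) VALUES (?, 1)", (url,))
    conn.commit()

def record_hit_native(conn, url):
    """Engine-specific version: a single upsert statement."""
    conn.execute(
        "INSERT INTO page_hits (url, hits) VALUES (?, 1) "
        "ON CONFLICT (url) DO UPDATE SET hits = hits + 1",
        (url,),
    )
    conn.commit()

def hits(conn, url):
    """Return the stored hit count for a URL, or 0 if it was never seen."""
    row = conn.execute(
        "SELECT hits FROM page_hits WHERE url = ?", (url,)
    ).fetchone()
    return row[0] if row else 0
```

Both functions leave the table in the same state; the difference is that the native version hands the whole job to the database in one statement, which is the kind of feature you give up when the goal is SQL that runs everywhere.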

What really blows me away is that in the 13 years I’ve been building database-backed web sites, I’ve only seen a web site switch out its back end once (at my recommendation), but I have seen web sites switch their front ends plenty of times.

In fact, if I compare my latest front-end switch to the only back-end switch I’ve ever done, I’d have to say that they were both worthwhile, but that the back-end switch was easier.

With the back-end switch, I moved from a small Oracle database to a small PostgreSQL database, mostly for cost reasons (can’t beat the cost of PostgreSQL!). I was (and still am) a huge fan of Oracle, especially its locking model (writers don’t block readers / readers don’t block writers), and I found the same locking model in PostgreSQL. And trust me, this was going from Oracle-specific, leverage-every-Oracle-advantage, vendor-locked-in code to PostgreSQL-specific, leverage-every-PostgreSQL-advantage, vendor-locked-in code. The switch was way more straightforward than the front-end switch I did from Java/J2EE to Python/Django. (But I’ll repeat that both switches were worthwhile.)

But this was the only time I’d ever switched back ends, and I don’t think I’m alone in noting how rare it is to switch RDBMSs. Front-end switches are way more common.

Why?

I think I was listening to a stackoverflow.com podcast where Jeff Atwood said the stackoverflow data was their most important asset—more important than even their front-end code. I think most websites / businesses feel that way about their data. Their data help them make decisions, make money, etc, etc. The processes that act upon those data are important too, but subordinate to the data.

Same thing with a web site. The front end is obviously important (no front end, no visitors!), but the data are king. I think this is why most web sites and businesses are very loath to do anything that might disrupt the database. But the front end? There’s room to experiment there. In fact, most good databases end up with multiple front ends, because their data are so valuable. And once you have multiple front ends, you become even more loath to change your database, because now more clients depend on it.

So not only does the oft-quoted ORM argument of “but what if you want to switch databases?” ring hollow to me, it actually strikes me as downright silly. Because front ends (you know, like where the ORM is kept?) change way more than back ends, and yet we still haven’t seen tools nearly as silly as ORMs to help people abstract away the front end in case of an inevitable switch.

And if having no front-end abstraction layers has not meant the end of the world, I think we should do away with back-end abstraction layers, and other foolish attempts to write “platform independent code at all costs”. Swapping out the back end is less common than swapping out the front end. Or, to put it more bluntly, designing for the eventual swapping-out of the back end is a very unusual, perhaps even bogus, project requirement.

Comments»

1. Martin - 27 July 2009

Once again, I find myself falling under your influential thinking. :-)

I’ve been messing around with Django on my local box and it is cool and all, but I keep feeling the innate wrongness of the Django models defining the data model. If I decide to use the data from another application, that application will need to abide by whatever schema Django created. And if I update the Django app, it means the schema changes with it. Other apps may break. Not good.

In short, Django (and Rails for that matter) actually create tight coupling, so the argument of using an ORM for portability goes out the window. To be fair, I don’t think that is a goal of either framework, but the problem of coupling is there.

The thing that really drove the point home is where you said, “The processes that act upon those data are important too, but subordinate to the data.”

Indeed.