jump to navigation

SQL Generation Is a Templating Problem, Not a Code Generation Problem 2 July 2009

Posted by manniwood in Programming, Pybatis, Python, SQL.
trackback

The only library I’ve ever used that I think got SQL generation right was iBATIS. Every other SQL generator, or object-relational mapper got it wrong.

The two basic flaws of everything from LINQ to Hibernate to Django’s ORM are 1) the assumption that developers don’t want to write SQL code, and 2) the use of a lower-level language to generate a higher-level language. Using .NET or Java or even Python to generate SQL makes as much sense to me as using Assembler to generete C. Code generation of this sort only works going from high-level to low-level; not the other way around. The other way around, you are using a more verbose language to generate a less verbose language. It escapes me why it is not intuitively obvious that using any other language to generate SQL is stupid. Just learn SQL!

I can find hope in the fact that nobody hides HTML generation behind layers of abstraction anymore. But object-relational-mappers continue to hide SQL generation behind layers of abstraction. Do you remember the old days when HTML could be generated by tools like Microsoft’s Front Page? Remember CGI.pm, which turned every HTML tag into a method?


print hr; # prints <hr>

Not only does this make no sense, it’s insulting. It assumes I don’t want to, or can’t, learn HTML. The same insult persists to this day with ORM-based SQL generation.

Here’s what I want to do. Let’s say I have captured user input in a map called form_fields that has a field called ‘creation_date’. When form_fields['creation_date'] is true, I want to show the creation date of the 20 most recently created users in my database, otherwise, not.

Here’s what I want the HTML to look like, using the Jinja templating engine:


<table>
<tr>
{% if form_fields.creation_date == True %}
<td>Created On</td>
{% endif %}
<td>Username</td>
</tr>
{% for user in users %}
<tr>
{% if form_fields.creation_date == True %}
<td>{{ user.created_on }}</td>
{% endif %}
<td>{{ user.username }}</td>
</tr>
{% endfor %}
</table>

Here’s what I want my SQL code to look like:


select
{% if form_fields.creation_date == True %}
created_on as "created_on",
{% endif %}
username as "username"
from users
limit 20

Obviously, I want to use templates for SQL the same way I do for HTML. The only difference is that whereas I send HTML to a browser once my templating engine has worked its magic, I send SQL to my RDBMS once my templating engine has worked its magic. And all I want back from my RDBMS is lists of maps with the key names specified by my “as” column aliases in my SQL code. (See my own Objects are Hammers, or Jeff Atwood’s Why Objects Suck or Ted Neward’s The Vietnam of Computer Science for why I do not want to return a list of objects or even engage in object-relational mapping in any way, shape, or form. Just give me a list of maps, please and thank you.)

In fact, that’s exactly what Pybatis allows me to do: I loved the central idea of iBATIS so much, I ported it to Python and named it Pybatis as an homage to the original. There are some differences in Pybatis: instead of using iBATIS’ custom XML templating syntax, I used Jinja templates, so that I could use the same templating engine for my SQL code as I was using for my HTML code. What could be easier? And, whereas iBATIS assumes you want to map result sets to objects (but, happily, allows outputing maps), Pybatis assumes you almost always want a list of maps (list of dicts, in Python). In fact, Pybatis was so easy to write (the original version took an afternoon), I’d encourage you to build your own SQL templating system using your favourite templating engine. You’ll never use ORM again.

In the realm of HTML generation, sanity has prevailed: everyone has realised that HTML generation is not a code generation problem but a templating problem. The beauty of this approach is that you are writing HTML most of the time, and you only drop into the templating language when you need to introduce variables and loops to the static world of HTML.

It turns out, the same approach works best for SQL as well. Sadly, this is not universally acknowledged like it is with HTML, but I hope one day it will be. For now, there is iBATIS, and my Python port (and homage), Pybatis.

Comments»

1. Has It All Been Said by Others? « Manni Wood - 17 July 2009

[...] may think that SQL generation is a templating problem rather than a code generation problem, and Ted Neward may think that ORM is the Vietnam of computer science, but recently, Jeremy Zawodny [...]

2. John O'Hanley - 18 July 2009

I agree with you wholeheartedly. The ostensible reason ORM tools were created in the first place was to hide SQL. But why do I want to do that? What’s is wrong with SQL? It’s a ‘domain specific language’ that’s been widely used for many years, is widely understood, is usually easy to understand.

HTML is the domain specific language for a web interface, and SQL is a domain specific language for persisting data.