Java’s Long.toBinaryString(long l); come on, guys! 21 October 2009

Posted by manniwood in Uncategorized.

So I’ve been doing some bit fiddling in Java, and because I don’t do a lot of bit fiddling, I want to print out the bits of longs so that I can get some feedback.

So I set up a long like so:

long someLong = 1L;

And I ask Java to show me each individual bit like so:

System.out.println(Long.toBinaryString(someLong));

And here’s what Java outputs.

1

Thanks for printing the other 63 zeros there, Java. Great effort.

Happily, I like collecting programming books, and John W. Perry’s Advanced C Programming by Example has a nice example of printing all the bits in short ints.

It’s actually kind of cool. First what you need is a way to test each bit in a series of bits. Let’s say you have the following byte:

00000010

and you want to test its second bit (from the end). You shift that bit to the end:

byte i = 2;  // i is 00000010
i  >>= 1;  // shift one place to get 00000001

If we were interested in the first (end-most) bit, we would harmlessly shift zero places:

i = 2;  // i is 00000010
i  >>= 0;  // shift zero places to get 00000010

If we were interested in the third bit, we would shift two places:

i = 2;  // i is 00000010
i  >>= 2;  // shift two places to get 00000000

You take advantage of the fact that byte’s binary representation of 1 is

00000001

so if you & together 00000001 with any other byte, the first seven zeros are guaranteed to make your result start with seven zeros, but the final 1 will either & together with a 1 to give you 1, telling you the last bit was set, or & together with a 0 to give you 0, telling you the last bit was not set.

// let's test the second bit:
i = 2;  // i is 00000010
i  >>= 1;  // shift one place to get 00000001
i &= 1;  // i is 00000001; the second bit was set

// let's test the second bit again:
i = 6;  // i is 00000110
i  >>= 1;  // shift one place to get 00000011
//   00000011
// & 00000001
// -----------
// = 00000001
i &= 1;  // i is 00000001; the second bit was set

So you can write a testBit function (sorry—method; Java doesn’t have functions) that tests the i-th bit of a byte like so:

// return 1 if bitToTest-th bit of val was set,
// else return 0
byte testBit(byte val, int bitToTest) {
    val >>= bitToTest;
    val &= 1;
    return val;
}
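
One thing worth checking is what happens with negative bytes, because Java's >> copies the sign bit in from the left instead of shifting in zeros. Here is a rough sketch of that check. TestBitDemo is a made-up class name, and testBit is the method from above with static added so that main can call it.

```java
final class TestBitDemo {

    // Same method as above, made static so main can call it.
    static byte testBit(byte val, int bitToTest) {
        val >>= bitToTest;  // compound assignment narrows the int result back to byte
        val &= 1;           // keep only the last bit
        return val;
    }

    public static void main(String[] args) {
        byte negative = (byte) 0x80;  // 10000000, which is -128 as a signed byte
        // >> copies the sign bit in from the left, so shifting 7 places
        // gives 11111111; the & 1 then discards everything but the last bit.
        System.out.println(testBit(negative, 7));  // prints 1: the eighth bit is set
        System.out.println(testBit(negative, 0));  // prints 0: the first bit is not
    }
}
```

So the sign extension is harmless here: whatever gets copied into the upper bits, the & 1 throws it away.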

And you can write a toBinaryString method that uses testBit like so:

String toBinaryString(byte val) {
    StringBuilder sb = new StringBuilder(8);
    for (int i = 7; i >= 0; i--) {
        sb.append((testBit(val, i) == 0) ? '0' : '1');
    }
    return sb.toString();
}

And so you can print all the zeros in your bytes:

byte i = 6;
System.out.println(toBinaryString(i));

Which will print this:

00000110

instead of this:

110

Nice, eh?
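
The same trick stretches to the long that started all this. What follows is only a sketch under a couple of assumptions: BitStrings is a made-up class name, and the unsigned shift >>> is swapped in for >>= so that zeros (not copies of the sign bit) come in from the left.

```java
// Sketch: the byte-sized approach above, widened to all 64 bits of a long.
// BitStrings, testBit and toBinaryString are illustrative names, not JDK APIs.
final class BitStrings {

    // Return 1 if the bitToTest-th bit of val (0 = last bit) is set, else 0.
    static long testBit(long val, int bitToTest) {
        // >>> shifts zeros in from the left, so negative values are safe too.
        return (val >>> bitToTest) & 1L;
    }

    // Build a 64-character string, most significant bit first.
    static String toBinaryString(long val) {
        StringBuilder sb = new StringBuilder(Long.SIZE);
        for (int i = Long.SIZE - 1; i >= 0; i--) {
            sb.append(testBit(val, i) == 0 ? '0' : '1');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long someLong = 1L;
        // Prints 63 zeros followed by a single 1.
        System.out.println(toBinaryString(someLong));
    }
}
```

For negative longs the result matches what Long.toBinaryString already prints, since the top bit is set and there are no leading zeros to drop.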

Catching Up with Ted Neward 17 October 2009

Posted by manniwood in Uncategorized.

Sometimes all I want to do in my blog is link to other blogs, like A Farewell to ORMs and ORMs are a thing of the past.

Not that ORMs are bad for all situations; just that they are not a panacea. I wonder what Ted Neward thinks of this continual rediscovery that ORMs have their issues?

ORM: Whatever Works 12 October 2009

Posted by manniwood in Uncategorized.

Mwanji Ezana asked me in a comment on my previous blog post:

I think one of the major advantages of ORMs is their lazy-loading, caching and query-batching ability. It’s not just about generating a schema and queries.

In all your anti-ORM posts, I’ve never seen you mention these capabilities. What do you make of them? Are they unimportant to you?

(Thanks for reading, Mwanji!)

I have a few comments, and I’ll have to start with the definition of ORM itself: as I discussed in my previous blog post, ORM has come to mean a lot of things, including SQL mappers like iBATIS, which I personally don’t consider ORM, but which, apparently, much/some of the programming community does.

So I’ll repeat that I only dislike the ORMs that write SQL for me behind my back. For the rest of this blog entry, if I say ORM, just think ORM of the sort that automagically does things for you; not ORM that facilitates easier writing of SQL queries, such as iBATIS.

When it comes to lazy-loading, caching, and query-batching, I like them all! It’s just that I think they are all best when decoupled from ORM. Caching, in particular, is arguably something that you may want decoupled from your ORM.

For instance, Django has separate ORM and caching mechanisms: caching even has its own standalone chapter in the Django book.

I think this makes sense: caching is for more than just database queries, so I think it’s a great facility to offer outside of an ORM.
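
To make that concrete, here is a rough sketch of a cache that knows nothing about databases. SimpleCache and its loader argument are hypothetical names for illustration, not part of Django or any ORM. The loader can be any function at all.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of a cache that is decoupled from any ORM or database layer.
// SimpleCache is a hypothetical name, not a class from a real library.
final class SimpleCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    SimpleCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Return the cached value, calling the loader only on a miss.
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    // Forget one entry, for example after the underlying data changes.
    void invalidate(K key) {
        entries.remove(key);
    }

    public static void main(String[] args) {
        // The loader could be a SQL query, a web service call, or a slow computation.
        SimpleCache<Integer, String> cache =
                new SimpleCache<>(id -> "expensive result for " + id);
        System.out.println(cache.get(42));  // calls the loader
        System.out.println(cache.get(42));  // served from the map
    }
}
```

Nothing in the sketch cares whether the value came from an RDBMS, which is the point: the same facility works for rendered pages, remote calls, or anything else that is slow to produce.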

I consider query-batching to be another facility that can just as easily be offered outside of an ORM. If anything, the most effective way I know to do large batch jobs on RDBMSs is to use the tools offered by the RDBMSs themselves, rather than those offered by any ORM or library. RDBMSs’ batch tools usually work best from the command line, allowing you to not only avoid your ORM for batch jobs, but avoid the whole application altogether.

I think I’m partly anti-ORM for aesthetic reasons: I try to avoid impedance mismatches instead of embracing them.

Consider a project where an object model perfectly describes the business domain, but there’s a need to store the data in SQL, perhaps because of SQL’s ability to generate great reports, or whatever. It’s the classic object/relational impedance mismatch.

Some (most?) projects embrace this mismatch by using some sort of ORM to bridge the gap.

Personally, I’d see if I could find a really good object database so that I didn’t have to deal with the mismatch. Why not store my objects in an object database and not even have to deal with an RDBMS? Maybe there’s an object database out there that can still generate the reports I need, or do other things I thought I needed SQL to do.

On the other hand, if it turned out that the project needed features only SQL can provide, I’d think long and hard about whether or not my business data really had to be an object model. Maybe I could use the relational data model after all? Maybe in my application code I could use lists of maps to represent my data (so it would be a lot like SQL tables and/or result sets) and avoid the impedance mismatch by essentially bringing my relational model up into my application layer.
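
As a rough sketch of what "lists of maps" could look like in application code: each row is a map keyed by column name, and a table or result set is a list of rows. The class name RowsAsMaps, its helper methods, and the customer columns are all invented for illustration. A real project would fill the rows from its database driver and would not build them by hand.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: rows as maps keyed by column name, a result set as a list of rows.
// The "customer" columns and values below are invented for illustration.
final class RowsAsMaps {

    // Build one row from alternating column names and values.
    static Map<String, Object> row(Object... namesAndValues) {
        Map<String, Object> r = new LinkedHashMap<>();  // keeps column order
        for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
            r.put((String) namesAndValues[i], namesAndValues[i + 1]);
        }
        return r;
    }

    // Roughly "SELECT * FROM rows WHERE column = value", done in application code.
    static List<Map<String, Object>> where(List<Map<String, Object>> rows,
                                           String column, Object value) {
        List<Map<String, Object>> matches = new ArrayList<>();
        for (Map<String, Object> r : rows) {
            if (value.equals(r.get(column))) {
                matches.add(r);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> customers = new ArrayList<>();
        customers.add(row("id", 1, "name", "Ann", "city", "Toronto"));
        customers.add(row("id", 2, "name", "Bob", "city", "Ottawa"));
        customers.add(row("id", 3, "name", "Cy", "city", "Toronto"));

        // Print the id and name of every customer in Toronto.
        for (Map<String, Object> r : where(customers, "city", "Toronto")) {
            System.out.println(r.get("id") + " " + r.get("name"));
        }
    }
}
```

There are no domain classes to keep in sync with the schema here; the trade-off is that column names are plain strings, so the compiler will not catch a typo.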

But that’s just me. I’m heavily biased towards solving problems by not having to solve them in the first place: Got an impedance mismatch? Pick a side. Now you don’t have to liaise between two ways of looking at the same data, because you just eliminated one by deciding that the other was more important.

But I’m happy to admit to at least two things:

1) Not all developers have the elimination/simplicity bias that I have, and

2) Not all projects can just pick one paradigm and eliminate another.

The continued popularity of ORM must mean that a lot of people are using it in a lot of successful projects.

I may personally suspect that a lot of projects probably succeed in spite of ORM, but I have to admit that it’s only because ORM has never been a good fit for the projects I’ve worked on, and that experience has influenced my thinking. But I think we programmers need to admit that it’s a big world of programming problems out there, and one size does not fit all.

If your project is doing great, and you’re using ORM, then in the context of your project, you are right, and I am wrong. I’m glad you’re not listening to my criticisms of ORM, and I’m glad you’re sticking with what works.

On the other hand, if ORM is not working for you, or it’s showing some strain, check out Ted Neward’s The Vietnam of Computer Science, or about 25% of the blogs I’ve ever written. ;-) You may find some observations that ring true, even if you continue to use ORM.

Brandon Bloom Nails it on ORM; and ORM’s Definition Has Grown 7 October 2009

Posted by manniwood in SQL.

As readers of my blog know, one of my favourite blog posts of all time is Ted Neward’s The Vietnam of Computer Science, where he discusses what a quagmire ORM is. But it looks like the definition of ORM has changed since Neward blogged about it, and I’m at risk of attacking database tools I love dearly, because I don’t consider these tools to be ORM—but others do.

I find when talking to developer friends of mine, and reading blogs, ORM is being quietly redefined to include tools like iBATIS, which doesn’t refer to itself as (and which I do not consider to be) an object relational mapper. iBATIS refers to itself as a SQL mapper. It can be used to map objects and relational database records, but it can also be used to return primitive types, data structures made out of simple String types, etc. It also makes it trivial to call stored procedures, which strikes me as decidedly non-ORM-ish.

Back when Ted Neward wrote his blog entry, I think it was assumed that ORM meant Hibernate, or Django’s ORM: tools that wrote SQL for you automagically in the background. I don’t believe that a tool like iBATIS was considered ORM back in 2006. Yet, in common parlance, iBATIS, and tools like it, seem to be considered ORM, which I just don’t get.

When Brandon Bloom discusses the shortcomings of Django’s ORM in his blog entry ORMs and Declarative Schemas, he explicitly singles out “the schema-generative ORM paradigm” for criticism, as though Django-style ORM is a sub-set of all other kinds of ORMs—which I guess now it is under this more expansive definition, even though I think Bloom’s sub-definition of ORM would not have been required a few years ago.

The acronym “ORM” seems to have become sort of a blanket term for any tool that helps you liaise between an RDBMS and any non-relational, (usually) object-oriented language. This is a bit unfortunate, because I used to think of ORMs (and I believe Neward’s original blog entry describes ORMs) as specifically tools that automagically generate SQL code on the fly based on your object model and configuration files. It used to be that tools that allowed you to write SQL by hand were not considered to be ORMs. But I guess now they are.

By this new definition, my own Pybatis could be considered an ORM, even though I expressly designed it not to be an ORM. (Though nothing in Pybatis would prevent you from building an ORM on top of it.)

I almost feel like going back and editing my old blog entries, because when I criticise ORM, I mean tools that auto-generate SQL, not tools like iBATIS or Pybatis that allow you to write SQL by hand. Note I don’t say “force you to write SQL by hand”. This I consider to be an important philosophical divide between what I used to consider ORM and what I consider not ORM: “let our tool write all that nasty SQL for you” (ORM) versus “let us make writing SQL easier” (not ORM).

Bloom’s article is well worth reading. It treads over ground I’ve covered in my own blog, and I’m not surprised any time somebody discovers that “the database itself should hold the authoritative schema, not a class declaration in the code.” I agree.

I have a feeling that this lesson will continually need to be learned. After all, Bloom is posting 4 years after Neward, but the conclusions remain the same. The siren call of ORM (sorry: specific ORMs that write your SQL for you) is still strong.

Maybe I should take heart in the idea that ORM is getting redefined to include tools that encourage you to write SQL by hand and that use the RDBMS as the authoritative data model for your project. Perhaps I shouldn’t care what people call that activity, as long as it gets pursued. Still, I wish there were a different name for this practice. (NoRM?) I think calling it ORM muddies the waters a little, and hides important differences in philosophy.

I’m even happy to concede that in certain greenfield projects, traditional ORM might be the way to go. (Hey, it is quick to get a site up and running with Rails-style or Django-style ORM.)

Perhaps I will start calling my favourite method of liaising with the RDBMS NoRM.