About 5 years ago, I decided I would try to see how much of the Model Driven Architecture trend was hype and what was reality. Then (as now) I had been attending conferences where the claims seemed outrageous to me.
There were those that claimed that programmers soon would be obsolete. Soon computer software would magically appear from domain models and they would be so much better than what a programmer could produce (I even heard someone talk about how we could feed in natural language use-cases and the MDA tool that he was building would produce wonderful systems). I knew most of this talk of snake-oil-salesmen.
There were others that claimed that MDA was utterly impractical and that nothing good would come from it. The languages were already at its peek in terms of balancing expressiveness and efficiency.
I knew the truth had to be somewhere in-between the two extremes and having about 15 years of experience with generative techniques, I thought I would have a good chance of finding the boundary.
This finally led me to create a tool called MCC (I first called it MDAC, Model Driven Architecture Compiler, but the OMG lawyers did not like my name, so I changed it to Model Component Compiler). The goal has not been to sell the tool, rather to use it as a research bed to push the envelope of the Model Driven ideas.
I’m now on the fifth generation of architecture and I think I’ve been able to push the envelope as originally planned. The tool now encapsulates almost all the knowledge I have and what other people generously have offered as optimizations through the 5 years.
Anyway, the idea of this blog was not to talk about MCC but about pushing the envelope. I have for a long time kept a list of the techniques that I’ve learned by reading books and having experts criticizing the generated implementation. I’ve always found that I can encode the knowledge learned into the tool, but today I reread a book called Programming Perls by Jon Bentley and I found a nice optimization example that I’m pretty sure I’m not going to be able to formalize.

The book opens with a nice example where someone was asked to sort a rather large file of telephone numbers (all of them 800 numbers) when having only a few megabytes of main memory available (but plenty of disk space). The programmer also had severe performance constraint and was asking for Jon Bentley for help…

Now think about this for a second or two…. How do you do this? Which search algorithm? Merge sort? Quick Sort?
How do you split the files? How many batches?

Bentley came up with a really elegant solution. Because we know that all the numbers start with 1-800-, we can reduce it to its 7 digits. The presence of seven digit numbers can be expressed in a bit array where every possible 7 digit number is represented by a bit (hence, we need 10^7 bits, or if you want a bit more than 1 MB of memory).
We can now read the input file and set the bit for every number that we see, e.g., if we have the number 1-800-192-1234, we set the bit with offset 1921234. When we have read through the input file, we start reading off the bits from the proper end and for every bit set we write the corresponding number.

Now, HOW do you encode this kind of cleverness in your tool???
0

Add a comment

About Me
About Me
Blog Archive
Subscribe
Subscribe
Links
Subscribe
Subscribe
Loading