Archive for the ‘Miscellaneous’ Category

NerdKit Gaming: Part 2

If you were interested in my last bit of alternative code-geekery, you may also be interested to hear that I’ve pushed that NerdKit Gaming code farther. If you browse the github repository now, you’ll find that the game also includes a highscore board, saved in EEPROM so it persists across reboot. It also features a power-saving mode that kicks in if you don’t touch any buttons for about a minute. Key-repeat now also allows the player to hold a button down, instead of pressing it repeatedly, in order to move the cursor multiple spaces.

You may remember that I left off my last blog post noting that there wasn’t much left for the game until I could find a way to slim down the code to fit new things. So what allowed these new features to fit?

Well, I did find ways to slim down the code: I was right about making the game state global. But, I also re-learned a lesson that is at the core of hacking: check your base assumptions before fiddling with unknowns. In this case, my base assumption was the Makefile I imported from an earlier NerdKits project. While making the game state global saved a little better than 1k of space, changing the Makefile so that unused debugging utilities (such as uart, printf, and scanf) weren’t linked in saved about 6k.

In that learning, I also found that attempting to out-guess gcc’s “space” optimization is a losing game. Making the game state global had a positive effect on space, but making the button state global had a negative effect. Changing integer types would help in one place, but hurt in others. I’m not intimately familiar with the rules of that optimizer, so it felt like spinning a wheel of chance choosing which thing to prod next.

You may notice that I ultimately returned the game state to a local variable, passed in and out of each function that needed it. The reason for this was testability. It’s simply easier to test something that doesn’t depend on global state. Once I had a bug that required running a few specific game states through these functions repeatedly, it just made sense to pay the price in program space in order to be able to write unit tests to cover some behaviors.

So now what’s next? This time, it’s not much until I buy a new battery. So much reloading and testing finally drained the original 9V. Once power is restored, I’ll probably dig into some new peripheral … maybe something USB?

NerdKit Gaming

Contrary to the evidence on this blog, not all of the code I write is in Erlang. It’s not even all web-based or dealing with distributed systems. In fact, this week I spent my evenings writing C for an embedded device.

I’ve mentioned NerdKits here before (affiliate link). This week I finally dug into the kit I ordered so long ago, and took it somewhere: gaming.

The result is a clone of a simple tile-swap matching game. I used very little interesting hardware outside the microcontroller and LCD — mostly just a pile of buttons. The purpose of this experiment was to test the capabilities of the little ATmega168 (and my abilities to program it).

I’ve put the code on github, if you’re interested in browsing. If you don’t have a NerdKit of your own to load it up on, I’ve also made a short demo video, and snapped a few up-close screenshots.

What did I learn? Mostly I remembered that writing a bunch of code to operate on a small amount of data can be just as fun as writing a bunch of code to operate on a large amount of data. Lots of interaction with the same few bytes from different angles has a different feel than the same operation repeated time and time again on lots of different data. I also learned that I’ve been spoiled by interactive consoles and fast compile/reload times. When it takes a minute or more to restart (after power cycles and connector un-re-plugging) and I don’t have an effectively infinite buffer to dump logs in, I think a little longer about each experiment.

So what’s next? Well, not much for this game, unless I slim down the code some more. Right now it compiles to 14310 bytes. Shortly before this, it was 38 bytes larger, and refused to load onto the microcontroller properly, since it plus the bootloader exceeds the 16K of flash memory available. My first attack would probably be to simply move the game board to a global variable instead of passing it as a function argument. The savings in stack-pushing should gain a little room.

If I were to make room for new operations, then a feature that saved a bit of state across power cycles would be a fun target. What’s a game without a high-score board?

Reading Code: Use Your Verbs

I’ve been reflecting on code quality lately. Partly that’s because I’ve been reading far more code than I’ve been writing. Partly it’s because the most recent code I was writing was intended primarily for reading, and only incidentally for execution, in the most literal way: it was instructional, not application-supporting.

And so it is that I’ve recently reaffirmed my conviction that code’s quality is primarily a function of its readability. Readability is of primary importance because code must be able to be understood in order to be used, and the way to understand it best is to read it.

However, I think I can be more specific about one component of readability that holds sway over the rest: naming. Partially the quality of each name, but also the ratio of named to unnamed things. But most important of all, the ratio of named to unnamed verbs.

I first realized this several years ago, while hacking in the middle of a complex, distributed, Java-based system. At one point, I had spent days diving through spaghetti, and finally found the core of the system … and it was beautiful. Not just the best Java code I’d ever seen, but possibly the cleanest code, period. Comparing it to the ugly code I had dug through, I found that its cleanliness derived from the fact that each interesting operation (or “verb”) was segregated into its own named function. Some of those functions called others of those functions, but it was always just one operation described in each.

Later, coincidentally on the same project, I had reason to spend several weeks not in Java, a language I knew very well, but in Perl, Python, and Bash, languages with which I was less familiar. I wrote and modified code very carefully in those languages, making sure that I could test each step as I went along. As that bit of hacking finished, I returned to Java, and found that my style had changed. I was now writing Java in a very careful, easily-testable manner. When I stepped back, I realized that the easily-understood form of my new Java code shared something with that beautiful core I had found earlier: each function described exactly one operation.

I’ll demonstrate what I mean with a concrete example. The code below is very similar to code I was hacking recently. The labels have been changed to protect the innocent, even though I think the innocent is me.

The set_properties function expects a token and a collection of properties (keys with matching values) to store for the token. New properties should overwrite old properties of matching keys, but old values for keys that are not specified should remain unchanged. For example, if the token “foo” had the properties [{a,1},{b,1}], and I called set_properties with the new properties [{a,2},{c,2}], then after set_properties finishes, the token “foo” should have the properties [{a,2},{b,1},{c,2}] (the new values for a and c plus the old value for b).

set_properties(Token, NewProperties) ->
   OldProperties = get_properties(Token),
   NewKeys = [ K || {K, _} <- NewProperties ],
   FilteredProperties = [ P || P={K, _} <- OldProperties,
                               not lists:member(K, NewKeys) ],
   set_properties_internal(Token, FilteredProperties ++ NewProperties).
    
Fig. 1: The Beginning

The code in Figure 1 is where I started. This code is correct: it conforms to the spec given, passes all tests (indeed, has been in production, working, for over a year). But, it is also bad code. The hint why is the NewKeys variable. It has little to do with setting new properties; it’s merely an artifact of cleaning up old properties. It’s an indication that the two list comprehensions that reference it are really an unnamed verb separate from set_properties.

set_properties(Token, NewProperties) ->
   OldProperties = get_properties(Token),
   MergedProperties = merge_properties(NewProperties, OldProperties),
   set_properties_internal(Token, MergedProperties).

merge_properties(Keep, Toss) ->
   KeepKeys = [ K || {K, _} <- Keep ],
   FilteredToss = [ P || P={K, _} <- Toss,
                         not lists:member(K, KeepKeys) ],
   FilteredToss ++ Keep.
    
Fig. 2: Naming the Verb

I propose that the code in Figure 2 is an improvement upon the code in Figure 1. The set_properties function now says just exactly what it’s going to do: look up the old properties, merge them with the new properties, and store the result. The details about how the merge is performed, the unnamed verb in Figure 1, have been relocated to a new function, named merge_properties. The intermediate list of keys is still produced, but it’s now obvious that it’s just part of the merging process.

set_properties(Token, NewProperties) ->
   OldProperties = get_properties(Token),
   MergedProperties = merge_properties(NewProperties, OldProperties),
   set_properties_internal(Token, MergedProperties).

merge_properties(Keep, Toss) ->
   lists:ukeymerge(1, lists:ukeysort(1, Keep), lists:ukeysort(1, Toss)).
    
Fig. 3: Using an Existing Name

Figure 3 is a demonstration of part of the reason that MIT changed the 6.001 curriculum. There was no need to write those list comprehensions. Someone had already written the equivalent and named it. It is far clearer to use that named operation than to reimplement. The confusion about why NewKeys was created has been removed, and so has the need to decrypt the other list comprehension.
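
As a quick sanity check, here is a minimal sketch that runs both versions of the merge on the “foo” example from the spec above. The module name merge_demo and the function names merge_by_filter, merge_by_ukeymerge, and check are hypothetical, chosen only for this illustration; the function bodies are copied from Figures 2 and 3.

```erlang
-module(merge_demo).
-export([merge_by_filter/2, merge_by_ukeymerge/2, check/0]).

%% Figure 2 style: drop old properties whose keys are being replaced,
%% then append the new ones.
merge_by_filter(Keep, Toss) ->
    KeepKeys = [ K || {K, _} <- Keep ],
    FilteredToss = [ P || P={K, _} <- Toss,
                          not lists:member(K, KeepKeys) ],
    FilteredToss ++ Keep.

%% Figure 3 style: sort both lists by key, then merge. On equal keys,
%% lists:ukeymerge/3 keeps the tuple from the first list.
merge_by_ukeymerge(Keep, Toss) ->
    lists:ukeymerge(1, lists:ukeysort(1, Keep), lists:ukeysort(1, Toss)).

%% Run both merges on the example from the spec.
check() ->
    Old = [{a,1},{b,1}],
    New = [{a,2},{c,2}],
    Filtered = merge_by_filter(New, Old),
    Merged = merge_by_ukeymerge(New, Old),
    %% Same contents, though not in the same order.
    true = lists:sort(Filtered) =:= lists:sort(Merged),
    {Filtered, Merged}.
```

Both versions hold the same properties, but only the ukeymerge version returns them sorted by key: on this input the comprehension version puts {b,1} first. That only matters if a caller depends on the order of the list. Note also that lists:ukeysort/2 drops later tuples with a repeated key inside a single list, which the comprehension version does not.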

set_properties(Token, NewProperties) ->
   OldProperties = get_properties(Token),
   MergedProperties = lists:ukeymerge(1,
                         lists:ukeysort(1, NewProperties),
                         lists:ukeysort(1, OldProperties)),
   set_properties_internal(Token, MergedProperties).   
    
Fig. 4: Breaking Context

It’s a valid question to ask why I didn’t recommend jumping straight from Figure 1 to Figure 4, instead of ending up at Figure 3. It’s true that Figure 4 is a large improvement on Figure 1, but the answer is that even though lists:ukeymerge/3 is a named verb, it’s a verb with less context than merge_properties in my module. The context is richer than this snippet suggests, because there is at least one other function in this module that needs to perform the same operation. Also, to reference 6.001 again, “Abstraction barrier!” Why does set_properties need to know the data structure I’m using?

set_properties(Token, NewProperties) ->
   set_properties_internal(
      Token, merge_properties(NewProperties, get_properties(Token))).

merge_properties(Keep, Toss) ->
   lists:ukeymerge(1, lists:ukeysort(1, Keep), lists:ukeysort(1, Toss)).
    
Fig. 5: Anonymous Nouns

Another valid question is why I didn’t continue on to Figure 5 after Figure 3. In truth, I did consider it. My eye sees less clutter, but having discussed this exact choice with many coworkers, I’ve learned that others don’t. It also goes against the grain of what this post has been advocating: while I’ve worked to name my verbs, Figure 5 anonymized my nouns. There’s a practical reason to keep names for nouns around: printf debugging. Unless I have a very nice macro handy that I can wrap around one of the function calls in-place in Figure 5, I’m forced to copy one of those calls to some other place, and possibly even give it a name, before I can wrap my print statement around it. In Figure 3, the names are already there; all I have to do is use them.
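
To make the printf-debugging point concrete, here is a rough sketch of the Figure 3 shape with one debug line added. The storage functions are stand-ins and an assumption of this sketch: the real get_properties and set_properties_internal are not shown in this post, so the process dictionary is used here only to make the example runnable. The module name debug_demo is also hypothetical.

```erlang
-module(debug_demo).
-export([set_properties/2, get_properties/1]).

%% Stand-in storage (an assumption for this sketch): the process dictionary.
get_properties(Token) ->
    case erlang:get({props, Token}) of
        undefined -> [];
        Props -> Props
    end.

%% Stand-in for the real storage call, which this post does not show.
set_properties_internal(Token, Props) ->
    erlang:put({props, Token}, Props),
    ok.

merge_properties(Keep, Toss) ->
    lists:ukeymerge(1, lists:ukeysort(1, Keep), lists:ukeysort(1, Toss)).

set_properties(Token, NewProperties) ->
    OldProperties = get_properties(Token),
    MergedProperties = merge_properties(NewProperties, OldProperties),
    %% The named noun makes the debug print a one-line addition.
    io:format("merged for ~p: ~p~n", [Token, MergedProperties]),
    set_properties_internal(Token, MergedProperties).
```

With the Figure 5 shape, the same print would require pulling the nested merge_properties call out into a variable first, which is the copy-and-name step described above.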

What else could be improved in Figure 3? Plenty: “merge” is a bit generic and over-used; “properties” is long, noisy, and redundant in-context. Is my omission of names for the sorted lists in merge_properties/2 hypocritical? Probably. Readability is a subjective, human measure. In multiple projects and languages, I’ve identified verb-naming as important in my judgement of a code’s readability. Maybe writing that fact down will help me remember to think about it in new code I write.

I’m Sorry (Maybe)

Why is it that embarrassing code has a way of sticking around? The specific variety of embarrassment doesn’t seem to matter (it could be hard to read, willfully inefficient, or just quirkily broken); all varieties live on equally well. Is it just that all code has a way of sticking around, and that we notice the embarrassing code more? Or is it that the embarrassing code is more likely to be written in those tough little corners that no one wanted to touch anyway, and still don’t want to touch now? I don’t know, but I do know every one of us has a few bits that we’d love to do over, if we could ever get the time to Do It Right.

I’m reminded of one of my most embarrassing bits every time I’m put on hold. The music comes on, I hear about three words, and then static. A couple of chords, more static. On and on.

The story of my embarrassment begins over ten years ago. The summer of 1999, I was interning at Lucent Technologies. It was my third summer there, and I was finally hacking on a product, not academic research (or IT upgrades, as my first summer had entailed).

The product was called Softswitch – an amazing new product in the early days of commercial IP telephony. The stack was some mix of C and Java, and there was a box humming somewhere with a connection to some corner of the phone system (at the very least the in-house ISDN). Interacting with telephones, over the internet, with software running on any old random box – wow![1]

My main task was helping to flesh out the add-on module system. “Flesh out” may be the wrong term. The goal of my work was more to experiment with the extension API they had created (known as the Programmable Feature Server), and to produce a demonstration of its capabilities, as well as to provide feedback about what was missing, rough, broken, etc. In the Web 2.0 world, I’d probably have been labeled “beta tester”.

Like most betas, the documentation was scarce. The rumor was that the lack of documentation was less intimidating for those that knew SS7 inside and out, but there was no way I was going to swallow that heap, and also produce something useful in three months.[2]

By late summer, I had implemented a fairly involved demo, boringly named ReminderCall. Dial in from any phone, navigate your way through Push-N-for-X menus, then eventually enter a time and record a message. At the time you chose, ReminderCall would dial your phone and play your message back to you. There was also a “web” frontend (either a Java servlet rendering HTML or a servlet talking to an applet; can’t remember which) for doing the same, as well as canceling or rescheduling pending reminders, if I recall correctly.

ReminderCall was a success. They liked it so much, they used it to demo Softswitch’s extensibility to MCI.

But it’s not the success that I intended to talk about here. The embarrassing code happened along the way to ReminderCall.

As a way to learn how to deal with audio streams, I first implemented another application with a somewhat smaller scope. Much like beginning to learn any display-based system by printing, “Hello World,” I began to learn this audio-based system by playing, “hello.” A few more hours of tinkering after that, and my application could also read key presses.

Polish things up a bit, and the first app I had ready was MusicOnHold. Being 17 at the time, all geek and zero taste, my demonstration music was none other than Sabotage by the Beastie Boys (light defense: it also happened to be one of the few songs I could find for download at the time, avoiding the lack of audio hardware in my workstation).

The nice thing about Sabotage is that it sounds like noise normally. Piping it over 8-bit (or less?) mono mainly just seems to change the timbre of the noise. It wasn’t until the boss asked me to find something more suitable for business-audience demonstration that it became apparent that the noise was part of the application. Glenn Miller’s In the Mood sounded better on every warped vinyl it ever graced. Dee-da-da-dee-kxhxhxhxhxhxhxhx-dee-dee-khxhxhxhxh.

There was worry, and hand wringing. Email went back and forth between us and the core Softswitch developers. Was it just Java unable to keep up (this was the 1.1 or 1.2 days, and I was still a n00b, after all)? Was it the interface to the switch? The network between the boxes? It’s true that the human voice requires less bandwidth to encode than something wide-frequency, high-dynamic-range like Big Band music, but I nevertheless tried re-encoding that song every which way. Things improved a bit, but still the static remained.

In the end, it was deemed more useful for me to press on and experiment with other features of the system, rather than muck about with this encoding trouble.

But there lies the perfect storm: an app no one really wanted to write, with a problem no one really wanted to touch, no one with the time to fix it anyway, and a flaw just embarrassing enough for me to remember it years later.

And now, every time I’m stuck on hold with static-filled music, I wonder whether someone just went ahead and packaged that MusicOnHold demo app with the Softswitch, and thereby forced my old, embarrassing code public. If that’s the case, then, I’m sorry, so kxhxhxhxhxhxhxhx.

[1] I used the department’s mail server for a time, to the chagrin of not only the admin, but another user trying to use that server to host their Netscape Navigator process.

[2] My mentor was also probing the reaches of the API, implementing the required wiretapping features, as I recall. She also gets credit for being the first person to introduce me to Emacs and OOP (by way of Java), not to mention a host of other enlightenment. Many thanks if you’re reading this by some chance!

NoSQL EU: Key-value Stores and Riak

I’m very excited to announce that I’ll be speaking at no:sql(eu). I’ll be covering Key-value stores and Riak. The talk should be a good overview of this [very] broad domain of datastores, as well as a closer look at a few unique features of some specific implementations.

I’ll also be teaching a Riak workshop on the last day of the conference. I plan to cover the design, implementation, and deployment of a simple wiki-like application. It should be a good introduction to simple Riak usage (just storing and fetching data), while also exposing some advanced features (like link-walking, map-reduce, and conflict-resolution).

Looking forward to meeting people there!

Padding Quietly Down the Hall

I haven’t posted here in a long time. I’ve wanted to. I have several posts partly written, just waiting on getting the last bits of example nailed down. But, as you can see, I haven’t finished the polishing I feel is necessary before posting them.

So, in an effort to jump-start my return to regular blogging, I’m going to do what everyone else is doing: I’m going to yammer about the iPad for a few paragraphs.

I am not, however, going to make some drooling prediction about it changing the world. I am also not going to make some frothing statement about how clueless Apple was to leave out my dream feature. I am not even going to pontificate about whether or not I’ll be buying one (or who else I think should or should not buy one).

Instead, I’d like to point out two things about the iPad that, I feel, have been underconsidered. Those two things are price and file-sharing.

Price

$499. Five hundred dollars less than what seemed to be the most popular pre-announcement prediction. This is amazing because it hits many sweet spots.

Five hundred dollars is basically Geek Toy money. No, it’s not impulse-buy, “I tossed it in my cart to get free shipping at Amazon,” money. But, for your typical, gadget-loving geek, ~$2^9 is, “Yeah, I was thinking about sampling the market anyway.” That means it’s going to have myriad creative eyes and brains contemplating all sorts of mixed-up, new, different uses from day one. By day thirty, I guarantee you will see a demo of something surprising.

Woah. Almost drooled a bit there. Calming down now.

Five hundred dollars, or more specifically sub-1k, is also the price that pundits have been demanding from Apple. There’s always been the Mac mini, but that’s not portable, and its sub-$1000 price really depends on you already owning a keyboard, mouse, and monitor (or doing your own bargain hunting). The iPad now opens the doors to people who want a real Apple computer for half the price of a MacBook (ignoring resale and discount programs).

File-sharing

Yes, a real computer. How can I say this with a straight face? It’s all because of a new feature in the SDK: file-sharing.

The iPhone platform, until now, has not been designed for content creation. Consume all you want, but only produce short emails and the occasional snapshot. This means that it wasn’t a problem that there was no good way to transfer the content you produced onto and off of the device. The small bits produced went out in email. The things consumed weren’t edited. (A few apps, like the excellent GoodReader, built in HTTP servers, just to get around this trouble.)

The iPad, though, has space and even dedicated hardware to provide UI for content creation. Indeed, Apple has ported the entire iWork suite to the device! In order for this to be of any use to anyone, though, there also needed to be a way to transfer that content elsewhere. Luckily, there is, in the form of a folder that each application can expose, which shows up as a directory on a drive when the iPad is plugged into a computer, just like any other USB drive. There is even a facility for asking other applications on the device to open files from the shared directory. No more bouncing things through remote network machines, just to get data moved between apps, or between the iPad and a desktop.

So that’s it. It’s cheap, and it can do more stuff. In a few months, I expect to see something really interesting.
