HTTP Decision Graph Comes to Life

Sometime last year, a flowchart describing the processing of an HTTP request (as rendered by Alan Dean) made its way around the net.


I thought, “Wouldn’t it be cool if your webserver could plot the path it took on that chart in handling a request?”


Well, now it can – if you’re using the latest Webmachine. The above graph is scaled down from a tool I created that is packaged with the just-released Webmachine 1.0.

You can read all about the tool on the Webmachine Debugging page, but the basic idea is: draw an edge between each node in the graph that is traversed during the processing of a request. Clicking on decisions brings up a panel that details what resource module functions were called at that decision (as well as what their parameters and return values were). There are also panels for the details about the request itself (method, path, headers, body) and the response (code, headers, body).

I’ve put up three example traces from BeerRiot’s beer_resource:

  1. The first hits /beer/1, and runs successfully all the way to 200: trace-example-200.html
  2. The second hits /beer/9999, which doesn’t exist, and runs to 404: trace-example-404.html
  3. The third hits /beer/536, which used to exist, but has since been merged with beer 202, so it runs to 301: trace-example-301.html

You’re floored, right? You just can’t wait to get your hands on it, right? </modesty> Well, doing so is easy. Once you have your Webmachine resource written, just change your init/1 function from:

init(Config) ->
   {ok, Config}.

to:

init(Config) ->
   {{trace, "/tmp"}, Config}.

Recompile and reload the module, then issue an HTTP request to it. You should see a file show up in /tmp with the extension .wmtrace.

Now open the Erlang shell for your webmachine application and type:

wmtrace_resource:add_dispatch_rule("wmtrace", "/tmp").

Tab over to your browser and hit /wmtrace/. You should see the list of .wmtrace files in your /tmp directory. Click on one of them, and you’ll be in the trace inspection utility.

For any pre-webmachine-1.0 users, getting access to this utility requires converting your resources to the new webmachine with referential transparency, but I can tell you from experience that that process is largely mechanical, and not that time consuming. I translated BeerRiot’s ~7000loc in about 2 hours (including testing).

I’d love to hear feedback about the trace tool. It’s the product of about three days of hacking (one proof-of-concept, one nearly-complete-rewrite, one actual-improvement), so I’m not hard-set on much of anything.

‘This’ is not what you think it is

It took me a little while to get used to Erlang. It was a couple of days before I knew the syntax without looking at examples. Several more days before I knew what utility functions to expect, and what their parameters should look like. A month or two before gen_server and supervisor clicked.

Another language has taken me longer to learn, though: Javascript. Sure, it was easy to just write some C-esque, Java-like syntax for simple calculations. Objects were pretty straightforward, as long as I didn’t do anything complicated. But when it got to lots of event handling, I found myself nervously copying the few examples that I knew worked. It took a thorough reading of Javascript: The Definitive Guide, and several intensive weeks of JS-coding for it to sink in.

I now blame my lost hours of Javascript paranoia on one word: this.

Javascript’s this is completely different from the this keyword of Java and C++ (the other two languages I know well enough to speak about that also have this keywords). Where Java’s this is the object in which the function is instantiated, Javascript’s this is the “execution context.”

For simple calls, like a.f(); Java’s and Javascript’s definitions are pretty much the same thing: this == a. In Javascript, though, you can say var q = a.f; q();, and suddenly this is not a, but rather the global context. To get this set to a again, you have to use q.apply(a);.

Even better – here’s a quick example:

js> var x = "global";
js> var f = function() { print("f: "+this.x); };
js> var a = {x:"object-a", f:f};
js> var b = {x:"object-b", f:function() { print("b.f: "+this.x); }};

js> f();
f: global

js> a.f();
f: object-a

js> b.f();
b.f: object-b

js> f.apply(a);
f: object-a

js> a.f.apply(b);
f: object-b

js> b.f.apply(a);
b.f: object-a

js> var q = b.f;
js> q();
b.f: global

If you’re as unfamiliar with this as I was, it’s best if you reason for yourself why each line prints what it does. The basic idea is that the output “function:context” tells you what function is executing (“f” or “b.f”), and in which context (“global”, “object-a”, or “object-b”).

If you’ve followed the example successfully, you can see that any function can be executed in such a way that this is whatever you want it to be. It doesn’t matter if the function is defined “in” some other object; if it can be referenced, it can be applied to the object of your choice.
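As a small companion to the shell session above, here is a sketch of two other ways to choose this for a call. It assumes an engine with Function.prototype.bind, which arrived with ECMAScript 5 and is not something the session above relies on; the names describe, pinned, and detached are made up for illustration.

```javascript
// A plain function that reports whatever `this` happens to be at call time.
function describe() {
  return "context: " + this.x;
}

var a = { x: "object-a", describe: describe };
var b = { x: "object-b" };

// Method call: `this` is the object to the left of the dot.
var viaMethod = a.describe();      // "context: object-a"

// call works like apply, but takes its arguments one by one instead of
// as an array; the first argument becomes `this`.
var viaCall = a.describe.call(b);  // "context: object-b"

// bind returns a new function with `this` pinned, so copying the
// reference around no longer loses track of `a`.
var pinned = a.describe.bind(a);
var detached = pinned;
var viaBind = detached();          // "context: object-a"

console.log(viaMethod, "|", viaCall, "|", viaBind);
```

The bound version is handy when a function reference has to be handed to code that will call it bare, as in the var q = a.f; q(); case described earlier.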

I woke up to the operation of this in a backward way: I first learned that Javascript was lexically scoped. That’s right, just like other great languages, as soon as you type function() { }, you “draw a double-bubble”, and every time you execute it, you “drop a box”. More to the point, Javascript supports closures, just like Scheme*!

“So, if Javascript has closures, what’s this for?” I asked myself. It seemed like a strange loophole – a way to define scope in direct conflict with the rest of the language. I shunned this. I didn’t need it. I knew how to operate in syntactic scope.

It wasn’t until I dug back into my event-handling code that it dawned on me: it’s the interaction between the two “context” systems that makes things interesting. Most Javascript event code (from DOM to Google Maps) executes the listener/handler functions with this set to the object that triggered the event. That’s right – you get references both to the context in which the function was defined (implicitly through scope) and to the context in which the function should be applied (explicitly through this).

For example, try out this code (assuming you have both jQuery and firebug installed):

Update: I’ve put up a version of this code here which puts the “console” on the page, so you don’t need firebug or your own jQuery.

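    <script type="text/javascript" src="jquery-1.2.6.js"></script>
    <script type="text/javascript">
function logClick(bc, bn, gc, gn) {
   console.log(["You've clicked",bc,bn,"times",
                "- of",gc,"total for",gn].join(' '));
}

$(function() {
    $('div').each(function() {
       var groupCount = 0;
       $(this).find('button').each(function() {
          var buttonCount = 0;
          $(this).click(function() {
             buttonCount += 1;
             groupCount += 1;
             // "this" is the clicked button; read the names off the page
             logClick(buttonCount, $(this).text(), groupCount,
                      $(this).parent().find('span').text());
          });
       });
    });
});
    </script>

      <div><span>group a</span>
        <button>button a1</button> <button>button a2</button></div>
      <div><span>group b</span>
        <button>button b1</button> <button>button b2</button></div>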

What you get is two “groups”, with two buttons each. Each time you click a button, the name of the button, as well as the number of times it has been clicked are printed on the console. Just to make it a little more interesting, the name of the group, as well as the total number of clicks of buttons in that group are also printed.

Two closures give us local storage for click counts: the outer one closes over groupCount, while the inner one closes over buttonCount (and also over groupCount by transitivity, of course). jQuery’s each function applies its argument to each element in the matched list, so this is set to each div in the outer iteration, and this is set to each button in the inner iteration. jQuery’s click function registers its argument in such a way that the handler gets applied to the element that triggered the event, so this is once again set to each button element in the inner-most function.

The end result is four functions registered as click handlers, each attached to a different button, each closing over its own buttonCount, and two pairs of those functions closing over the same groupCount. When the handler is called, this is set to the button, and we can extract text from the page for logging.
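For anyone without a browser handy, the same interplay can be sketched without the DOM. Everything here is a stand-in rather than jQuery itself: makeButton, onClick, and wireGroup are hypothetical names, and the only thing borrowed from jQuery is the convention of calling each handler with this set to the element that was clicked.

```javascript
// Hypothetical stand-in for a DOM button: it stores handlers and, like
// jQuery, invokes each one with `this` set to the button itself.
function makeButton(label) {
  var handlers = [];
  return {
    label: label,
    onClick: function (handler) { handlers.push(handler); },
    click: function () {
      for (var i = 0; i < handlers.length; i++) {
        handlers[i].call(this);
      }
    }
  };
}

var log = [];

function wireGroup(groupName, buttons) {
  var groupCount = 0;              // shared by every handler in this group
  buttons.forEach(function (button) {
    var buttonCount = 0;           // private to this one handler
    button.onClick(function () {
      buttonCount += 1;
      groupCount += 1;
      // `this` is the clicked button; the counts come from the closures.
      log.push(this.label + ":" + buttonCount + " " +
               groupName + ":" + groupCount);
    });
  });
}

var a1 = makeButton("a1"), a2 = makeButton("a2"), b1 = makeButton("b1");
wireGroup("group a", [a1, a2]);
wireGroup("group b", [b1]);

a1.click(); a2.click(); a1.click(); b1.click();
console.log(log);
// log is now:
//   ["a1:1 group a:1", "a2:1 group a:2", "a1:2 group a:3", "b1:1 group b:1"]
```

Each handler keeps its own buttonCount, the two handlers in "group a" share one groupCount, and none of the handlers needs a reference to its button stored anywhere, because this supplies it at call time.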

Yes, this is a trivial, contrived example that’s not good for much on its own, but this technique can be applied to any place a callback is used: iteration constructs and DOM events (as shown above), XHRs, Google Maps events, etc. Need to share a reference between several callbacks (or several executions of the same callback), but don’t want to pollute your namespace? Closures are the answer. Need to know why your callback is being called? this to the rescue.

A coworker and I were discussing all of this the other day, when the question arose, “Why a magical this? Why not just use the calling convention that ‘this’ is the first parameter of the function?” Well, I’m still trying to come up with a good answer for that one. In truth, many libraries do also pass an argument to the callback function that is equal to this. My strongest feelings so far: standardization and syntactic sugar.

By standardizing that this always means this very specific thing, we dodged the bullet of some libraries supporting it, while others don’t, while still others call it something else. Sometimes you just have to make a decision to ensure that it has been decided.

Without this defined as-is, the syntax for “applying foo’s bar member to foo” becomes foo.bar(foo); why name foo twice? In addition, all function signatures now become function bar(this, x, y, z, ...), whether they’re interested in this or not.
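To make the comparison concrete, here is a rough sketch of the two conventions side by side. The objects fooExplicit and fooImplicit are invented for the illustration, and the explicit parameter is called self because this is a reserved word and cannot actually be used as a parameter name.

```javascript
// Convention 1: the receiver is passed explicitly as the first parameter.
var fooExplicit = {
  name: "foo",
  bar: function (self, x) { return self.name + " got " + x; }
};

// Convention 2: the language supplies the receiver as `this`.
var fooImplicit = {
  name: "foo",
  bar: function (x) { return this.name + " got " + x; }
};

// With the explicit convention, the caller has to name the object twice.
var explicitResult = fooExplicit.bar(fooExplicit, 1);

// With `this`, the object is named once, to the left of the dot.
var implicitResult = fooImplicit.bar(1);

console.log(explicitResult, "|", implicitResult);
```

Both calls produce the same string; the difference is only in how much the caller and the function signature have to spell out.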

As I dive into playing more with prototypes, I expect to find additional interesting qualities to the use of this, but as yet, I can’t speak to them.

I’m sure this has all been written before (I still haven’t read all of the Douglas Crockford text that has been recommended to me), but I was just so excited to find that Javascript was a more interesting language than I originally thought, that I had to write about it. Maybe if enough of us say it, we can prevent future programmers from wasting their time on the same misconceptions.

* “Scheme is like Victoria’s Secret: It’s supposed to be elegant, but really it’s just dirty,” is printed next to my picture in my yearbook. I’ve since married the woman who said it, and learned the error of my ways. Likely not related events, but who can say for sure?

Like/Shrug/Dislike Gains Traction

Steven Frank agrees that 3-choice voting (like/shrug/dislike) is plenty. It’s great to see someone else suggest it. Maybe we’ll see it pop up in a few more places.

BeerRiot solves the specific problem Steven is grappling with (the sea of “average” ratings) in another way, as well: all BeerRiot ratings are personalized. You’re less likely to have to deal with a mess of vague ratings on BeerRiot because a social clustering algorithm is being used to sift the wheat from the chaff. BeerRiot can figure out which users’ votes are more important for your score, and uses them to give you a clearer picture. This means the score you see is much more likely to be in-line with your impression of the beer, not just the “average community rating.”

For those interested, here is my reasoning for keeping the ‘neutral’ vote around, in opposition to John Gruber’s suggestion.

Doing it Live

We have a phrase around the office: “Do it live!” It comes from the incredible freakout of Bill O’Reilly. We use it to mean something along the lines of, “This is a startup. The plan might change at any time. Changes go to production when we need them to, and we roll with bugs as best we can.” Far from encouraging careless, fickle choices, it’s a reminder that the camera is on, we’re live, and we are actively developing a product that is under close scrutiny.

Luckily, we have the power of Erlang behind us. The dynamic nature of the language and runtime is a fantastic fit for an environment in which things may change at a moment’s notice.

Erlang’s dynamic nature also came in useful for me on BeerRiot last night. I’ve blogged about hot code loading before, but last night I dipped into the world of OTP applications and Mnesia.

I realized late yesterday afternoon that I had left the login code in a state where usernames were case-sensitive. People could have signed up as “Bryan” and “BRYAN”, even though I already owned the login “bryan”. Basically, I was lazy; the username lookup code was roughly:

%% Name is the test username as read out of the http request
mnesia:transaction(
    fun() ->
        mnesia:match_object(#person{name=Name, _='_'})
    end).

What I needed to do was downcase both the test name and the stored name, and compare those results. I could have just tossed in a call to string:to_lower and reloaded the login module, except that I’m trying to support UTF-8 everywhere. To downcase a UTF-8 string, I needed another library (because I’m not going to bother implementing my own).

Google pointed me in the direction of Starling. Despite the strange build process[1], starling provides an Erlang interface to the ICU libraries, to enable unicode manipulations. A quick build and test, and we have

LowerName = ustring:downcase(ustring:new(Name))

Toss an application:start(starling) in the BeerRiot startup code, and everything’s set to go … but why would I want to restart the webserver? Restarting is lame – we’re doing it live!

Instead of restarting, we’ll connect to the webserver through an erl shell (see my earlier hot code loading post about doing this) and modify the running system. We just need two simple commands to get this done.

1> code:add_paths(["/path/to/starling/ebin"]).
2> application:start(starling).

Command 1 tells Erlang to add a path to its library loading search. Command 2 starts the starling application. Starling is now up and running, and we can ustring:downcase/1 as much as we want.

But, I really don’t want to downcase every stored username every time. It’s also kind of nice for people’s usernames to display as they typed them, but not require the same capitalization in their login form. So, I’ll need to store the downcased version somewhere, in addition to keeping the original. I could put it in a new table, mapping back to the persons table, but it’s person data – let’s keep it with the person.

I need to add a field to my person record. But if I do that, all of my code looking for a person record of the current format will break. I need to update all of my person records in storage as soon as I load the code with the modified person record definition.

Mnesia gives us just the tool for this: mnesia:transform_table/3. All we have to do is provide a function that knows how to translate an old person record into a new one. Something like this will do:

%% old def: -record(person, {id, name}).
%% new def: -record(person, {id, name, login}).
add_login() ->
    mnesia:transform_table(
        person,
        %% rewrite each old 3-tuple record as a new 4-tuple record
        fun({person, Id, Name}) ->
            {person, Id, Name, ustring:downcase(ustring:new(Name))}
        end,
        record_info(fields, person)).

Stick that code in the person module, where the person record is defined. Now, connect back to the webserver and simply:

3> l(person).
{module, person}
4> person:add_login().

There’s a short period of time in there, between the ends of commands 3 and 4 where any code that looks up a person record will break. But, it’s short, and the entire rest of the site will continue functioning flawlessly.

And that’s the amazing power of Erlang. A very brief, very limited hiccup, and new functionality is deployed. Assuming the appropriate code was put in place to start everything up on restart, the system will come up in exactly the state you want it if the server should ever reboot.

Now back to tinkering… 🙂

[1] I oughta ‘make’ you ‘rake’ my lawn, which you’re on, by the way, sonny.


A year in the making, almost completely rewritten, I can’t bear to hold it back any longer: today I release the new BeerRiot. Here’s a synopsis of the changes for you:

Old Tech          New Tech
Erlyweb (Yaws)    Webmachine (Mochiweb)
Erlydb + MySQL    Hand-coded models + Mnesia + Apache Solr
ErlTL             jQuery

I’ll probably write a blog post about each of those rows sometime in the near future. It should be said, though, that my motivation in this rewrite was not to abandon Erlyweb. Rather, each piece was a deliberate attempt to get practice on something we were using at work. Erlyweb’s great, but Webmachine has a different feel. MySQL can store data fine, but it’s quite different from the key-value store I hack against all day. ErlTL’s pretty nice as templating languages go, but I needed more DOM experience.

Luckily, the new technologies are also very nice. Webmachine forces you to become more familiar with the ins and outs of HTTP, but after writing a few resources, it becomes natural and quick to create new ones. Storing data in Erlang format in Mnesia is heavenly, and Solr has drastically improved search functionality. jQuery makes JavaScript in the browser far less painful.

In doing this rewrite, it’s been a real eye-opener to dig back through my early Erlang code. It wasn’t terrible, but having worked with other serious Erlang hackers all year, I notice the difference in the code I write now. The site should be much more stable now – Local/maps may even stay up for more than an hour. 😉

In that vein, though, I request that you not judge the JS running the site too harshly just yet. Just like my early Erlang was ugly, I can now tell that my early JS was ugly as well. That will be getting some cleanup soon, but I just couldn’t stand delaying the release for it.

So, go poke it and let me know what you think!

WWW is back

Unfortunately, I only found out this morning that some of you probably thought I had given up on BeerRiot. A month or so ago, I started hosting another domain on the same server, and thought I’d set up Yaws to handle that properly. Unfortunately, I botched it badly enough that, while the bare domain still worked, the “www.” version didn’t. Sorry!

Anyway, things are fixed now, so if you like prepending “www.”, welcome back!

You may have also questioned the lack of new feature announcements of late. The reason behind this is a giant rewrite I’ve been working on all year. More details in a few weeks, but it’s almost an entire redesign that touches every level. At the same time, work has been crazy busy all year. I’m excited to share the new stuff with you all, though, so keep me in your RSS feed, so you don’t miss the announcement.

Webmachine Released!

One of the cool technologies I mentioned in my last post has just been released open-source. Webmachine is a nice framework for creating web-friendly resources. We use it as the engine for serving our dynamic web content. For a bit more color to the description, read Justin’s post.

I like it so much I may even attempt a BeerRiot port at some point…

Erlang at Basho

Hmm…so, six-ish months ago, I posted that I had just spent a month working at a new job, where I got to code Erlang all day, and that I had a crazy month coming up, which would mean less attention here. That ‘month’ extended itself through the summer, and is going to continue through the fall, but I wanted to give you all an update.

Of course, anyone who really wanted to dig has already found that my new all-Erlang-all-the-time job is at Basho Technologies, but here’s an announcement in a little more obvious place. We’re creating a web-based system for salespersons and sales organizations to track, analyze, and improve their process. We are not just another CRM, and do not intend to become one. We’re providing real tracking, guidance, and analysis, not just an ‘online rolodex’.

But, enough about the business, you all want to know about the tech we’re using, right? Okay, so it’s not quite all-Erlang-all-the-time. Depending on the week, I spend anywhere from 40-90% of my time in Erlang, but because we’re providing a web service, I spend the rest of the day in HTML (or our template/rendering language of choice), Javascript, or CSS. There may be a time at which I’ll have to add Java and/or Flash to that mix as well, but for now, they’re staying down the hall.

Erlang provides the logic for basically all of our server-side code. The webserver, mochiweb with a nice logic layer overtop code-named “webmachine” (which you’ll probably hear more about in the coming months), is entirely coded in Erlang, and accepts Erlang modules as “resource providers.” The storage system uses components from the distributerl project to provide a distributed storage solution. Even our template/rendering language is interpreted by Erlang code.

At this point, I’m sure there are a few of you with your jaws hanging open, minds full of protests and questions. “What about the lack of strings?” “What about the syntax?” I tell you now: these are not problems we’ve stumbled over.

“Lack of strings” is a complete misnomer. There are lists, iolists, and binaries. Through combinations of the three, we get everything we need. We do what coders have been doing for decades: we wrap tricky, repetitive, often-needed code in libraries and move on. Honestly, we’re having more trouble with the standard datetime tuple (no standard for timezone marking, confusing timezone expectations of conversion functions).

To truly get an idea of how much trouble the syntax is, I recommend you try this exercise: code in Erlang-only for a few weeks, then try coding in something else, like Javascript. I think you’ll realize that each language has its own bunch of syntax quirks.

We were able to run the exercise above in our first few weeks of coding. Many of us had never touched Erlang, and some had never touched Javascript. Each language took a couple of days for the learner to be able to produce something useful, and another week or so for the same learner to be able to do it on their own in a reasonable amount of time. We do have complaints about the way each language is written now, but they share little with the complaints we had when starting out.

Case in point: records. Everyone’s favorite structure to beat up (after strings, of course). At first they seem like overly-verbose, half-baked forms of structures from other languages. Then comes the epiphany: they’re not designed to take the place of those structures. The best way to think about records is to envision C-structs full of void pointers. Then you can realize that their real power is in pattern matching – function guards, selective extraction, format assertions.

But just describing the ways in which Erlang hasn’t hindered us isn’t very impressive – how has Erlang actually helped us? Well, we’ve been able to implement our own home-grown distributed storage system in just a couple of months. We’ve also been able to create an entirely new design of webserver that makes writing truly RESTful webservices simple and quick. We’ve been able to troubleshoot problems by quickly opening shell connections to the running webservers and learning exactly what is going on. We’ve written lots of extra self-contained services that can export, import, analyze, and otherwise munge live data easily. In short, we’ve been able to iterate quickly over lots of proposed solutions to problems and choose based on experience, rather than weak assumptions and hearsay.

Or, maybe it’s just that I work with an amazing team that’s able to roll with punches quickly, and usually end up with a better solution in the end than the one we designed in the beginning. Could be – I know this team is the most talented I’ve met since college. 🙂

Sitting here almost eight months since starting Erlang full-time, and almost two years since picking it up for personal projects, I’d say that I’m here to stay for a while yet. I don’t doubt that another language will steal my attention in the future, or that there are other languages I’d choose for other types of projects. But, for distributed web-based services today, Erlang has my vote.


Woah – so much for February, eh?

In case people are wondering what’s up in BeerRiot land, here’s the skinny:

January and February were crazy months. I switched day jobs when I switched the calendar, and that meant a lot of extra planning in January, and a lot of extra concentration in February.

Unfortunately, March probably won’t be much different around here. My new day job has a big deadline on April 1, and we’re probably going to be at a dead run from now until then.

But, lucky for me, this doesn’t mean I have to give up Erlang – quite the opposite! In fact, my day job now has me writing Erlang all day long. 🙂 At some point in the future, I hope to be able to reveal where I’m working and what I’m working on, but that’s under wraps at the moment (it’s a new venture that doesn’t want the cat out of the bag just yet).

In the meantime, I do still log in to BeerRiot every night to keep things running. So, if you’re still looking for good beer recommendations, keep checking in.

Vimagi on Erlang2facebook

Erlang2facebook continues to gain users. The latest is Yariv’s Vimagi Paint! Ignore my scribblings, and give it a whirl. There’s some really amazing work up there. (And, nice job, Yariv!)

For anyone else playing with the library, you might want to sync with the repository. Yariv’s prodding caught a couple of bugs, whose fixes were committed a few days ago.