Webmachine Released!

One of the cool technologies I mentioned in my last post has just been released open-source. Webmachine is a nice framework for creating web-friendly resources. We use it as the engine for serving our dynamic web content. For a bit more color to the description, read Justin’s post.

I like it so much I may even attempt a BeerRiot port at some point…

Erlang at Basho

Hmm…so, six-ish months ago, I posted that I had just spent a month working at a new job, where I got to code Erlang all day, and that I had a crazy month coming up, which would mean less attention here. That ‘month’ extended itself through the summer, and is going to continue through the fall, but I wanted to give you all an update.

Of course, anyone who really wanted to dig has already found that my new all-Erlang-all-the-time job is at Basho Technologies, but here’s an announcement in a little more obvious place. We’re creating a web-based system for salespersons and sales organizations to track, analyze, and improve their process. We are not just another CRM, and do not intend to become one. We’re providing real tracking, guidance, and analysis, not just an ‘online rolodex’.

But, enough about the business, you all want to know about the tech we’re using, right? Okay, so it’s not quite all-Erlang-all-the-time. Depending on the week, I spend anywhere from 40-90% of my time in Erlang, but because we’re providing a web service, I spend the rest of the day in HTML (or our template/rendering language of choice), Javascript, or CSS. There may be a time at which I’ll have to add Java and/or Flash to that mix as well, but for now, they’re staying down the hall.

Erlang provides the logic for basically all of our server-side code. The webserver, mochiweb with a nice logic layer on top code-named “webmachine” (which you’ll probably hear more about in the coming months), is entirely coded in Erlang, and accepts Erlang modules as “resource providers.” The storage system uses components from the distributerl project to provide a distributed storage solution. Even our template/rendering language is interpreted by Erlang code.

At this point, I’m sure there are a few of you with your jaws hanging open, minds full of protests and questions. “What about the lack of strings?” “What about the syntax?” I tell you now: these are not problems we’ve stumbled over.

“Lack of strings” is a complete misnomer. There are lists, iolists, and binaries. Through combinations of the three, we get everything we need. We do what coders have been doing for decades: we wrap tricky, repetitive, often-needed code in libraries and move on. Honestly, we’re having more trouble with the standard datetime tuple (no standard for timezone marking, confusing timezone expectations of conversion functions).
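To make the lists/iolists/binaries point concrete, here is a minimal sketch (not our production code; the module and function names are made up for illustration). It builds text as an iolist without flattening anything, then converts to a binary or a plain list only at the edge:

```erlang
-module(strings_sketch).
-export([greeting/1, demo/0]).

%% Build a greeting as an iolist: a nested mix of binaries and lists.
%% Nothing is copied or flattened at this point.
greeting(Name) ->
    [<<"Hello, ">>, Name, "!"].

demo() ->
    IoList = greeting("Basho"),
    %% iolist_to_binary/1 flattens the nested structure in one pass.
    Bin = iolist_to_binary(IoList),
    %% binary_to_list/1 gives back the plain list-of-integers "string".
    {iolist_size(IoList), Bin, binary_to_list(Bin)}.
```

Most I/O functions accept the iolist as-is, so the flattening step is often unnecessary.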

To truly get an idea of how much trouble the syntax is, I recommend you try this exercise: code only in Erlang for a few weeks, then try coding in something else, like Javascript. I think you’ll realize that each language has its own bunch of syntax quirks.

We were able to run the exercise above in our first few weeks of coding. Many of us had never touched Erlang, and some had never touched Javascript. Each language took a couple of days for the learner to be able to produce something useful, and another week or so for the same learner to be able to do it on their own in a reasonable amount of time. We do have complaints about the way each language is written now, but they share little with the complaints we had when starting out.

Case in point: records. Everyone’s favorite structure to beat up (after strings, of course). At first they seem like overly-verbose, half-baked forms of structures from other languages. Then comes the epiphany: they’re not designed to take the place of those structures. The best way to think about records is to envision C-structs full of void pointers. Then you can realize that their real power is in pattern matching – function guards, selective extraction, format assertions.
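A small sketch of what that looks like in practice, assuming a hypothetical beer record (none of these names come from our codebase). The function heads do the guarding, extraction, and format assertion in one place:

```erlang
-module(record_sketch).
-export([new/3, describe/1]).

%% Hypothetical record, for illustration only.
-record(beer, {name, brewery, abv = 0.0}).

new(Name, Brewery, Abv) ->
    #beer{name = Name, brewery = Brewery, abv = Abv}.

%% Selective extraction plus a guard: only the abv field is inspected.
describe(#beer{abv = Abv}) when Abv >= 8.0 -> strong;
%% Match on a single field value, ignoring the rest.
describe(#beer{name = undefined})          -> unnamed;
%% Format assertion: any other #beer{} record lands here...
describe(#beer{})                          -> ordinary;
%% ...and anything that is not a #beer{} record at all lands here.
describe(_)                                -> not_a_beer.
```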

But just describing the ways in which Erlang hasn’t hindered us isn’t very impressive – how has Erlang actually helped us? Well, we’ve been able to implement our own home-grown distributed storage system in just a couple of months. We’ve also been able to create an entirely new design of webserver that makes writing truly RESTful webservices simple and quick. We’ve been able to troubleshoot problems by quickly opening shell connections to the running webservers and learning exactly what is going on. We’ve written lots of extra self-contained services that can export, import, analyze, and otherwise munge live data easily. In short, we’ve been able to iterate quickly over lots of proposed solutions to problems and choose based on experience, rather than weak assumptions and hearsay.

Or, maybe it’s just that I work with an amazing team that’s able to roll with punches quickly, and usually end up with a better solution in the end than the one we designed in the beginning. Could be – I know this team is the most talented I’ve met since college. 🙂

Sitting here almost eight months since starting Erlang full-time, and almost two years since picking it up for personal projects, I’d say that I’m here to stay for a while yet. I don’t doubt that another language will steal my attention in the future, or that there are other languages I’d choose for other types of projects. But, for distributed web-based services today, Erlang has my vote.

Vimagi on Erlang2facebook

Erlang2facebook continues to gain users. The latest is Yariv’s Vimagi Paint! Ignore my scribblings, and give it a whirl. There’s some really amazing work up there. (And, nice job, Yariv!)

For anyone else playing with the library, you might want to sync with the repository. Yariv’s prodding caught a couple of bugs, whose fixes were committed a few days ago.


Disclaimer: The following is a diversion from BeerRiot. To all Rioters waiting on new features, I apologize. I can only claim temporary insanity due to cold.

I ran across two interesting Erlang posts recently. The first was to the Trapexit help forum, where someone was attempting to implement a Brainfuck interpreter in Erlang, as a way of learning the language. I didn’t understand the question being asked, so I decided to give it a go myself, to see if I ran into a similar question.

A little while later, this is the implementation I had:


-module(bf).
-export([run/1]).

%% Left/Right: program text before/after the execution point.
%% LP/RP: data cells to the left/right of the current cell, Data.
run(Program) ->
    run([], Program, [], 0, []).

run(Left, [$>|Right], LP, Data, RP) ->
    case RP of
        [D|Rest] -> run([$>|Left], Right, [Data|LP], D, Rest);
        []       -> run([$>|Left], Right, [Data|LP], 0, [])
    end;
run(Left, [$<|Right], LP, Data, RP) ->
    case LP of
        [D|Rest] -> run([$<|Left], Right, Rest, D, [Data|RP]);
        []       -> run([$<|Left], Right, [], 0, [Data|RP])
    end;
run(Left, [$+|Right], LP, Data, RP) ->
    run([$+|Left], Right, LP, Data+1, RP);
run(Left, [$-|Right], LP, Data, RP) ->
    run([$-|Left], Right, LP, Data-1, RP);
run(Left, [$.|Right], LP, Data, RP) ->
    %% Print the current cell as a character.
    io:put_chars([Data]),
    run([$.|Left], Right, LP, Data, RP);
run(Left, [$,|Right], LP, _, RP) ->
    %% io:get_chars/2 returns a string (or eof), so unwrap the character.
    Data = case io:get_chars([], 1) of
               [Char] -> Char;
               _      -> 0
           end,
    run([$,|Left], Right, LP, Data, RP);
run(Left, [91|Right], LP, Data, RP) ->
    %% 91 is "[": skip forward past the matching "]" if the cell is zero.
    if Data == 0 ->
            {NewLeft, NewRight} = pass_match(91, 93, [91|Left], Right),
            run(NewLeft, NewRight, LP, Data, RP);
       true ->
            run([91|Left], Right, LP, Data, RP)
    end;
run(Left, [93|Right], LP, Data, RP) ->
    %% 93 is "]": jump back to the matching "[" if the cell is non-zero.
    if Data /= 0 ->
            {[91|NewRight], NewLeft} = pass_match(93, 91, [93|Right], Left),
            run([91|NewLeft], NewRight, LP, Data, RP);
       true ->
            run([93|Left], Right, LP, Data, RP)
    end;
run(Left, [X|Right], LP, Data, RP) ->
    %% Anything that is not an operator is a comment; step over it.
    run([X|Left], Right, LP, Data, RP);
run(_, [], _, Data, _) ->
    %% End of program: return the value of the current cell.
    Data.

pass_match(This, Match, Accum, Source) ->
    pass_match(This, Match, Accum, Source, 0).
pass_match(_, Match, Accum, [Match|Source], 0) ->
    {[Match|Accum], Source};
pass_match(This, Match, Accum, [Match|Source], Depth) ->
    pass_match(This, Match, [Match|Accum], Source, Depth-1);
pass_match(This, Match, Accum, [This|Source], Depth) ->
    pass_match(This, Match, [This|Accum], Source, Depth+1);
pass_match(This, Match, Accum, [X|Source], Depth) ->
    pass_match(This, Match, [X|Accum], Source, Depth).

Basically, a list for what is to the left of the current execution point, and another for what is to the right, as well as a list each for what is to the left and right of the current data point. Just tail-recurse through the right list (with a little extra jumping for the loop operators), pattern matching the opcode at the head of the right program list. Run the program by calling bf:run(Program) where Program is just a list of characters (including Brainfuck symbols if you want any result other than 0). For example, the following code will print out “Hello World” (found on the Wikipedia page).

bf:run("++++++++++
[>+++++++>++++++++++>+++>+<<<<-] The initial loop to set up useful values in the array
>++.                             Print 'H'
>+.                              Print 'e'
+++++++.                         Print 'l'
.                                Print 'l'
+++.                             Print 'o'
>++.                             Print ' '
<<+++++++++++++++.               Print 'W'
>.                               Print 'o'
+++.                             Print 'r'
------.                          Print 'l'
--------.                        Print 'd'
>+.                              Print '!'
>.                               Print newline").

The second post I happened across was someone noticing the alternative way to make Erlang atoms (by enclosing characters in single quotes). Commenters were unhappy that they had never found a good use for functions named using these atoms.

Well, guess where that thought took me:


-module(bf2).
-export([run/1]).
-export(['>'/2, '<'/2, '+'/2, '-'/2, '.'/2, ','/2, '['/2, ']'/2, stop/2]).

-record(tape, {left=[], current=0, right=[]}).

%% Map each single-character exported function name to its atom.
function_list() ->
    lists:foldl(fun({Atom, _}, Funs) ->
                        case atom_to_list(Atom) of
                            [Char] -> [{Char, Atom}|Funs];
                            _ -> Funs
                        end
                end, [], proplists:get_value(exports, bf2:module_info())).

run(Program) ->
    Funs = function_list(),
    %% Keep only the operators, converted to atoms (in reverse order).
    Atoms = lists:foldl(fun(C, T) ->
                                case proplists:get_value(C, Funs) of
                                    undefined -> T;
                                    A -> [A | T]
                                end
                        end, [], Program),
    [Current|Rest] = lists:reverse([stop|Atoms]),
    bf2:Current(#tape{current=Current, right=Rest}, #tape{}).

%% Step the program tape one operator to the right and call it.
advance(Program, Data) ->
    [Next|Rest] = Program#tape.right,
    bf2:Next(#tape{left = [Program#tape.current | Program#tape.left],
                   current = Next,
                   right = Rest},
             Data).

stop(_, #tape{current=Value}) ->
    Value.

'>'(Program, Data) ->
    case Data#tape.right of
        [X|R] -> NewPoint = X, NewRight = R;
        _ -> NewPoint = 0, NewRight = []
    end,
    advance(Program, #tape{left = [Data#tape.current | Data#tape.left],
                           current = NewPoint, right = NewRight}).

'<'(Program, Data) ->
    case Data#tape.left of
        [X|L] -> NewPoint = X, NewLeft = L;
        _ -> NewPoint = 0, NewLeft = []
    end,
    advance(Program, #tape{right = [Data#tape.current | Data#tape.right],
                           current = NewPoint, left = NewLeft}).

'+'(Program, Data) ->
    advance(Program, Data#tape{current = Data#tape.current + 1}).

'-'(Program, Data) ->
    advance(Program, Data#tape{current = Data#tape.current - 1}).

'.'(Program, Data) ->
    %% Print the current cell as a character.
    io:put_chars([Data#tape.current]),
    advance(Program, Data).

','(Program, Data) ->
    %% io:get_chars/2 returns a string (or eof), so unwrap the character.
    In = case io:get_chars([], 1) of
             [Char] -> Char;
             _      -> 0
         end,
    advance(Program, Data#tape{current = In}).

'['(Program, Data) ->
    if Data#tape.current /= 0 ->
            advance(Program, Data);
       true ->
            {Left, Right} = skip('[', ']',
                                 Program#tape.left, Program#tape.right),
            advance(#tape{left=Left, current=']', right=Right}, Data)
    end.

']'(Program, Data) ->
    if Data#tape.current == 0 ->
            advance(Program, Data);
       true ->
            {Right, Left} = skip(']', '[',
                                 Program#tape.right, Program#tape.left),
            advance(#tape{left=Left, current='[', right=Right}, Data)
    end.

skip(Up, Down, Acc, Src) -> skip(Up, Down, [Up|Acc], Src, 0).

skip( _, Down, Acc, [Down|Src], 0) -> {Acc, Src};
skip(Up, Down, Acc, [Down|Src], N) -> skip(Up, Down, [Down|Acc], Src, N-1);
skip(Up, Down, Acc, [Up|Src], N)   -> skip(Up, Down, [Up|Acc], Src, N+1);
skip(Up, Down, Acc, [X|Src], N)    -> skip(Up, Down, [X|Acc], Src, N).

Basically, create a function named for each Brainfuck operator. Then, convert all of the valid Brainfuck operators in the program into atoms, and use them to call the functions sharing their names. Run it just like the earlier example, bf2:run(Program).

Now, I’m not going to call the first implementation ugly. In fact, I think it’s a fair example of walking a list, doing different things depending on the value of the head of the list. But, I have to say that the second version does read a bit nicer, in some respects. (I also tried using the tape record in the first example, but I thought it made things worse.)

Yeah, okay, Brainfuck clearly still isn’t a great use for quirky-atom function names, but perhaps it represents some problem space that can make good use of them?

Anyone have a better neat trick – for either Brainfuck interpretation or funky-atom function names?

P.S. My apologies for not posting an answer to Alboin (the Trapexit poster). I can’t remember my login details for Trapexit. 😛

Disclaimer 2: WordPress really doesn’t like dealing with so many <s and >s. I think I got everything, but if something doesn’t work, that’s probably the culprit.

Denormalization, Processes

If you read the news, you’ll know that tuneups are happening behind the scenes of BeerRiot. If you came to this blog after reading that story, you’re wondering what, exactly, they are.

If I’m not feeling particularly communication-challenged, I’ll be able to explain them to you. 😉

The first tuneup is one every webmaster has heard of: denormalization. I had been using a view to select data from three tables with one call. The performance drag of that query was serious enough, though, that I’ve decided to complicate things a bit and copy the extra bits of data I need from the other tables into the main one for the query.

The speed gain is great, and, somewhat strangely, the denormalization actually cleaned up a bunch of my code. ErlyDB lacks a “one-to-one” relation, so it was impossible for me to say “each record in this view is really just a record in this other table with some extra data.” That made for a bit of hackery swinging from one type to another. Without that extra table, I think the code reads more clearly.

(Disclaimer: I’m far from being a relational database master, so it’s likely that there is a much better way to express everything I’m doing. But, I’m happy to be making what seems to be forward progress.)

The other main change is more Erlang-centric. Until now, I had been tracking sessions using a customization of the Yaws recommended session server. This is basically a central process that stores opaque data associated with an id string. Whenever your app gets a request, it pulls the cookie value out and checks with this central process to find out if there is any opaque data associated with this key. It works (quite well, in fact), but it seems like a bit of a bottleneck.

So, I’ve decided that there’s a more Erlangy way to do things. What BeerRiot is doing now is starting up a new process for each session, and saving that process id in a client cookie. Then, whenever a request comes in, if it has a cookie with a PID, we can try to contact that session’s handling process directly. No central service required.
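Here is a rough sketch of the idea, not BeerRiot’s actual code: every name in it (session_sketch, store/3, fetch/2, the 30-minute idle timeout) is an assumption made for illustration. Each session is a process holding a key/value list, and the cookie is just the printed pid. A real deployment would want to sign or otherwise protect that cookie value, since list_to_pid/1 is documented as a debugging aid.

```erlang
-module(session_sketch).
-export([start/0, cookie/1, lookup/1, store/3, fetch/2]).

%% Start one process per session; its state is a key/value list.
start() ->
    spawn(fun() -> loop([]) end).

%% Turn the pid into a string suitable for a cookie value.
cookie(Pid) ->
    pid_to_list(Pid).

%% Map a cookie value back to a live session process, if there is one.
lookup(Cookie) ->
    try
        Pid = list_to_pid(Cookie),
        case is_process_alive(Pid) of
            true  -> {ok, Pid};
            false -> expired
        end
    catch
        error:badarg -> invalid
    end.

%% Fire-and-forget write to the session.
store(Pid, Key, Value) ->
    Pid ! {store, Key, Value},
    ok.

%% Synchronous read from the session, tagged with a unique reference.
fetch(Pid, Key) ->
    Ref = make_ref(),
    Pid ! {fetch, self(), Ref, Key},
    receive
        {Ref, Reply} -> Reply
    after 1000 ->
        timeout
    end.

loop(State) ->
    receive
        {store, Key, Value} ->
            loop(lists:keystore(Key, 1, State, {Key, Value}));
        {fetch, From, Ref, Key} ->
            From ! {Ref, proplists:get_value(Key, State)},
            loop(State)
    after 30 * 60 * 1000 ->
        %% Idle for 30 minutes: let the session expire.
        ok
    end.
```

A request handler would call lookup/1 on the cookie value and fall back to start/0 when it gets expired or invalid.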

It turns out that there are loads of benefits to having this session process hanging around, beyond relieving the central service bottleneck. It can cache data smartly (i.e. listen for updates, etc.). It’s a natural place to run background processes (like propagating live changes to durable storage). I see other potential uses, but since I haven’t tested them yet, I’ll hold my tongue to avoid getting too many hopes up. 😉

For Facebook developers: This process-session system wasn’t possible until just a few weeks ago, when Facebook started supporting cookies on the canvas page. Unfortunately, they only support them for canvas requests, and not for their “mock ajax.” For mock ajax, I’ve decided to just encode the cookie values in post parameters. It works (and it’s no more inconsistent than the rest of the Facebook Developer experience).

Update 2.Jan 18:52 EDT: If you spent any part of today poking at BeerRiot to see how the speed-ups turned out, you were probably rather dissatisfied. I just figured out that I didn’t fully roll out the update. 😛 It’s there now, and I think you’ll be much more impressed.

Erlang Facebook Code Example

I’ve finally prevented distraction long enough to finish an example use of the Erlang Facebook library I posted earlier.

If you grab the source from the erlang2facebook project, you’ll now find it comes with a bunch of stuff in an “erlprints” directory. The code in “erlprints” is a near literal translation of the “Footprints” app that comes with the standard Facebook PHP library.

It’s not perfect, and there are certainly places where more Erlang-ish style could have been used, but I hope it’s good enough to give people a clue to how to use the library.

You’ll need to set up Erlang, ErlyWeb, and MySQL (not to mention getting a Facebook account and adding the developer app) before starting.

Good luck!

Erlang Facebook Code Open-source

Hi all. I’ve been meaning to do this for a while now, and the requests are only becoming more frequent, so – my Erlang-Facebook bridge code is now open for use. You can get it from the erlang2facebook Google Code project.

Big warning: the main reason I wasn’t releasing this code yet is because I don’t feel that it’s documented well enough. Anyone interested in using this code will likely need to have both the Facebook doc pages and the standard Facebook PHP code open, for comparison.

I’ve been working on recreating the sample “footprints” app to package with this code. I’ll post it as soon as I do (I keep getting distracted).

Also forgive me if I’ve committed some terrible Google Code faux pas. It’s my first project hosted there, so I’m sure I missed something.

Two Birds, One Hot Code Load

I bet there are a lot of people still questioning the utility of hot code loading. Especially in the web app field, it can seem a little gratuitous. PHP apps don’t need any special hot load facility – the script just gets reread from disk every once in a while.

Well, even if we ignore that there are likely parts of web apps that do need to run all the time, and are not just executed at request time, there’s still the web server to think about. And, guess what I did last week.

Yaws provides lots of nice utility functions. One in particular is yaws_api:htmlize/1, which takes an IoList as an argument, and returns the “same” list with the four big offenders (ampersand, double-quote, less-than, and greater-than) replaced with their HTML entities (&amp; et al.). This function does exactly what you need when serving HTML directly to a modern web browser.

Unfortunately, htmlize/1 doesn’t work perfectly when sending “FBML” to Facebook. During Facebook’s translation, it converts all characters with codes greater than 127 (that is, anything outside ASCII) to unrecognizable characters, which come out as some form of “?” in a browser.

The fix is simple: encode every character over 127 as an HTML entity of the form &#X;, where X is the decimal representation of the character code. In Yaws 1.68, add the following two lines just before yaws_api.erl:590 (the line with the guard for integer(X)):

htmlize_l([X|Tail], Acc) when integer(X), X > 127 ->
    htmlize_l(Tail, [$;, integer_to_list(X), $#, $&|Acc]);

Compile the new code by running make in the base directory of the Yaws source. If you’re running Yaws from a directory other than the source directory, copy ebin/yaws_api.beam to that other ebin directory.

Edit: See dbt’s comment for a way to skip the next paragraph in the simple case.

Now for the magic. My preferred way to load new code is to open a console to the web server’s erlang node. First, run “erl -sname Name”, where Name is any name other than that of your webserver. Once erl starts up, type C-g (control-g, for you non-Emacs-ers). You’ll be asked for a “User switch command”. Typing “h” will get you help here, but what you actually want to do is type “r yaws@host”, where “yaws@host” is the node name of the webserver’s erlang node. Typing “j” should now show two shell sessions. Connect to the second one with “c 2”.

Now that you’re connected to your web server’s erlang node, just type “l(yaws_api)” to load the new code. Any module calling into the yaws_api module will now automatically use the new code. Meanwhile, any code that was in the middle of a yaws_api module call will finish the call with the old code.

Edit: You won’t need the next paragraph either, if you followed dbt’s instructions.

When you’re done mucking about (I know you’ve just spent the last half hour figuring out what other bits of your webserver you can touch from here), type C-g again, then kill the remote console with “k 2”. Connect back to your first console with “c 1”, then exit it in the normal manner.

So, voila, Á now comes out of htmlize/1 as &#193;. International beer names show up properly in Facebook, and (oh boy!) BeerRiot Local now works in IE6 (which couldn’t parse those letters from the XML tag file). Two birds, one stone, I love it.
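For anyone who would rather not patch Yaws, the same encoding can be sketched as a standalone function. This is not the yaws_api code, just an illustration of the technique; the module name entity_sketch is made up, and it only handles a flat list of character codes rather than a full IoList:

```erlang
-module(entity_sketch).
-export([encode/1]).

%% Encode a flat list of character codes, returning an iolist.
encode(Chars) ->
    encode(Chars, []).

encode([], Acc) ->
    lists:reverse(Acc);
%% The four big offenders get their named entities.
encode([$&|T], Acc) -> encode(T, ["&amp;"|Acc]);
encode([$<|T], Acc) -> encode(T, ["&lt;"|Acc]);
encode([$>|T], Acc) -> encode(T, ["&gt;"|Acc]);
encode([$"|T], Acc) -> encode(T, ["&quot;"|Acc]);
%% Everything above 127 becomes a decimal numeric entity.
encode([X|T], Acc) when is_integer(X), X > 127 ->
    encode(T, [["&#", integer_to_list(X), ";"]|Acc]);
%% Plain ASCII passes through untouched.
encode([X|T], Acc) ->
    encode(T, [X|Acc]).
```

Calling lists:flatten(entity_sketch:encode([193])) gives "&#193;", matching the patched htmlize/1 behavior described above.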

On a couple of side notes:

Thanks for the versioning system suggestions. I’ve settled on Mercurial for now, and I’m quite happy with it so far. Bit of a pain upgrading Python versions, but I probably should have done that long ago anyway.

And, I’ve been doing more than just building a website around beer this summer. I’ve also been growing my own ingredients! I made my first hop harvest earlier this week. It was only an ounce wet, which turned into about 1/8 oz. dry, but I was proud to have some success anyway. Here’s the proof:

Cascade hops growing in my back yard