Archive for the ‘Development’ Category

Simple Webmachine Extension (4/4): DELETE

This is the third post in a four-part series about extending a simple Webmachine resource. The first part discussed adding support for the HTTP method PUT, and the second part added basic authorization, while the third part added conditional requests through ETag.

DELETE

I can modify variables through PUT; how about removing them altogether with DELETE?

First a little sidetrack. While Erlang exposes os:getenv/1 and os:putenv/2, it does not expose an os:deleteenv/1. So, what I’m going to do instead is adopt the convention that a variable set to “nothing” (i.e. the empty list) has been deleted. This requires a little modification to resource_exists/2 to filter out variables with empty-list values. (See the full source at the end of this post.)

Sidetrack #2: I also noticed that there was a bug preventing values from containing the equals sign, so while hacking resource_exists/2, I fixed that as well.

With the empty-list-is-deleted convention in hand, I can implement DELETE handling like this:

-export([delete_resource/2]).

delete_resource(RD, Ctx) ->
    os:putenv(wrq:path_info(env, RD), []),
    {true, RD, Ctx}.

I only want to allow one variable to be deleted at a time, so I’ll modify allowed_methods/2 such that it only adds 'DELETE' to the method list if the request path is of the form /_env/VARIABLE.

I also want DELETE to require authorization, so I’ll modify is_authorized/2 to watch for both PUT and DELETE.

And that’s that. I can now use the following curl command to delete MY_VAR:

$ curl -u webmachine:rules -X DELETE http://localhost:8000/_env/MY_VAR

Wrap Up

Before I get to the complete source, I want to take a moment to highlight something I stressed in an earlier blog post: Did you notice that none of the code I’ve displayed in the last few days has mentioned specific HTTP response codes?

The env_resource now runs all over the HTTP decision flowchart, returning everything from 405 when methods other than GET, HEAD, PUT, and DELETE are issued (or just the first three in the whole-environment case), to 401 when PUT or DELETE is attempted without the proper credentials, and 412 when ETags don’t match on a PUT (or 304 when they do match on a GET!), and yet, I didn’t mention a single code. I simply described the properties I wanted the resource to have, and let Webmachine do the translation.

I stress this because I think it’s important to consider the power of an HTTP translation system. I’ve seen reduced development time as an effect of not having to worry about getting to the right response code in every corner case, while still adding necessary headers. I’ve also seen reduced troubleshooting time as an effect of being able to read through a resource that only concerns itself with describing its properties, rather than including a lot of mechanics for tearing apart HTTP requests and building up HTTP responses.

Plug

If this series has raised your interest level in Webmachine, I recommend you attend Justin Sheehy’s talk at the Bay Area Erlang Factory.

The Complete Code

%% dispatch:
%% {["_env"],      env_resource, []}.
%% {["_env", env], env_resource, []}.

-module(env_resource).
-export([init/1, content_types_provided/2, resource_exists/2, to_json/2]).
-export([allowed_methods/2, content_types_accepted/2, from_json/2]).
-export([is_authorized/2]).
-export([generate_etag/2]).
-export([delete_resource/2]).
-include_lib("webmachine/include/webmachine.hrl").

init(_) -> {ok, undefined}.

content_types_provided(RD, Ctx) ->
    {[{"application/json", to_json}], RD, Ctx}.

resource_exists(RD, Ctx) ->
    case wrq:path_info(env, RD) of
        undefined ->
            Result = [ {K, string:join(V, "=")}
                       || [K|V] <- [ string:tokens(E, "=")
                                     || E <- os:getenv() ],
                          V /= [] ],
            {true, RD, {struct, Result}};
        Env ->
            case os:getenv(Env) of
                false  -> {false, RD, Ctx};
                []     -> {false, RD, Ctx};
                Result -> {true, RD, Result}
            end
    end.

to_json(RD, Result) ->
    {mochijson:encode(Result), RD, Result}.


%% PUT support

allowed_methods(RD, Ctx) ->
    {['GET', 'HEAD', 'PUT'
      |case wrq:path_info(env, RD) of
          undefined -> [];
          _         -> ['DELETE']
       end],
     RD, Ctx}.

content_types_accepted(RD, Ctx) ->
    {[{"application/json", from_json}], RD, Ctx}.

from_json(RD, Ctx) ->
    case wrq:path_info(env, RD) of
        undefined ->
            {struct, MJ} = mochijson:decode(wrq:req_body(RD)),
            [ os:putenv(K, V) || {K, V} <- MJ ];
        Env ->
            MJ = mochijson:decode(wrq:req_body(RD)),
            os:putenv(Env, MJ)
    end,
    {true, RD, Ctx}.


%% AUTH support

-define(AUTH_HEAD, "Basic realm=MyOSEnv").

is_authorized(RD, Ctx) ->
    case wrq:method(RD) of
        PD when PD == 'PUT'; PD == 'DELETE' -> basic_auth(RD, Ctx);
        _                                   -> {true, RD, Ctx}
    end.

basic_auth(RD, Ctx) ->
    case wrq:get_req_header("Authorization", RD) of
        "Basic "++Base64 ->
            case string:tokens(base64:mime_decode_to_string(Base64), ":") of
                ["webmachine", "rules"] -> {true, RD, Ctx};
                _                       -> {?AUTH_HEAD, RD, Ctx}
            end;
        _ -> {?AUTH_HEAD, RD, Ctx}
    end.


%% ETAG support

generate_etag(RD, Result) ->
    {mochihex:to_hex(erlang:phash2(Result)), RD, Result}.


%% DELETE support

delete_resource(RD, Ctx) ->
    os:putenv(wrq:path_info(env, RD), []),
    {true, RD, Ctx}.

Simple Webmachine Extension (3/4): ETags

This is the third post in a four-part series about extending a simple Webmachine resource. The first part discussed adding support for the HTTP method PUT, and the second part added basic authorization.

ETag

If I’m sharing management of a server with someone else, I want to be careful of overwriting that other person’s changes. For instance, I might only want to engage “DANGER_MODE” if I can be sure that “COAST=clear”.

I need to know that between my last GET and my next PUT, that the value of the “COAST” variable hasn’t changed. I can handle this simply by generating an ETag for the resource:

-export([generate_etag/2]).

generate_etag(RD, Result) ->
    {mochihex:to_hex(erlang:phash2(Result)), RD, Result}.

Now I can issue conditional requests. When I GET /_env the response will have an ETag header, which is reasonably guaranteed to change if the environment variables change. I can take that ETag and toss it back in an If-Match, and the request will only succeed if the environment variables are in the same state as I last saw them (because the ETag won’t match otherwise). That is:

$ curl -u webmachine:rules -X PUT -H "If-Match: LAST_ETAG" \
   -H "Content-type: application/json" http://localhost:8000/_env/ \
   -d "{\"DANGER_MODE\":\"engaged\"}"

will only succeed if the enviroment is in the state it was when I issued the request that returned Etag: LAST_ETAG (when, hopefully, I checked to make sure that the coast was clear).

Update: part four is up.

Simple Webmachine Extension (2/4): Authorization

This is the second post in a four-part series about extending a simple Webmachine resource. The first part discussed adding support for the HTTP method PUT.

Authorization

Something about modification of server state screams, “Password protection!” at me. Let’s guard the PUT method with Basic auth:

-export([is_authorized/2]).

-define(AUTH_HEAD, "Basic realm=MyOSEnv").

is_authorized(RD, Ctx) ->
    case wrq:method(RD) of
        'PUT' -> basic_auth(RD, Ctx);
        _     -> {true, RD, Ctx}
    end.

basic_auth(RD, Ctx) ->
    case wrq:get_req_header("Authorization", RD) of
        "Basic "++Base64 ->
            case string:tokens(base64:mime_decode_to_string(Base64), ":") of
                ["webmachine", "rules"] -> {true, RD, Ctx};
                _                       -> {?AUTH_HEAD, RD, Ctx}
            end;
        _ -> {?AUTH_HEAD, RD, Ctx}
    end.

Arbitrary decisions:

  • Only PUT is protected. If GET and HEAD should be protected as well, just replace the body of is_authorized/2 with the body of basic_auth/2

I need to update my curl command if I don’t want to be told I’m unauthorized:

$ curl -u webmachine:rules -X PUT -H "Content-type: application/json" \ 
   http://localhost:8000/_env/MY_VAR -d "\"yay\""

Come back tomorrow for part three, where I add a modicum of atomicity.

Update: part three is up.

Simple Webmachine Extension (1/4): PUT

I was in need of a break last night after flailing in the face of a new library-language-build-system for a few hours. So, I decided to hack some Webmachine (it was too late to sit down at the trap set).

I was thinking about how the os-environment resource from my last post could be extended. This post begins a four-part series in which new capabilities are added to env_resource.erl.

PUT

Let’s start with modification. Why not allow setting those variables?

I need to do three things: announce that the resource supports PUT, list what types of data it accepts, and then actually handle decoding the incoming data. Fifteen quick lines of code should handle it:

-export([allowed_methods/2, content_types_accepted/2, from_json/2]).

allowed_methods(RD, Ctx) ->
    {['GET', 'HEAD', 'PUT'], RD, Ctx}.

content_types_accepted(RD, Ctx) ->
    {[{"application/json", from_json}], RD, Ctx}.

from_json(RD, Ctx) ->
    case wrq:path_info(env, RD) of
        undefined ->
            {struct, MJ} = mochijson:decode(wrq:req_body(RD)),
            [ os:putenv(K, V) || {K, V} <- MJ ];
        Env ->
            MJ = mochijson:decode(wrq:req_body(RD)),
            os:putenv(Env, MJ)
    end,
    {true, RD, Ctx}.

Arbitrary decisions were made above:

  • Clients shall send JSON-encoded data. I could have just as easily added another element to the return value of content_types_accepted/2 and written another method to handle it (e.g. {"text/plain", from_text} and from_text/2). Accepting JSON is nice, since it makes PUT and GET simply symmetric.
  • /_env expects a JSON structure, while /_env/VARIABLE expects a string. Again with the simple symmetry between PUT and GET.
  • /_env only modifies or creates the variables described in the JSON structure it receives. Alternatively, it could have cleared out any unnamed environment variables, but this seemed unnecessary.
  • No body is returned in a successful response. It would have been fairly simple to generate the same body that would have been returned in a GET, then use wrq:append_to_response_body/2 to return a modified RD, but this also seemed unnecessary.

I can now set MY_VAR to "hello", using two different curl commands:

$ curl -X PUT -H "Content-type: application/json" \
   http://localhost:8000/_env -d "{\"MY_VAR\":\"hello\"}"

$ curl -X PUT -H "Content-type: application/json" \
   http://localhost:8000/_env/MY_VAR -d "\"hello\""

Come back tomorrow for part two, in which I’ll add authorization via username and password.
Update: part two is up.

HTTP Decision Graph Comes to Life

Sometime last year, a flowchart describing the processing of an http request (as rendered by Alan Dean) made its way around the net.

http-graph-small-unmarked

I thought, “Wouldn’t it be cool if your webserver could plot the path it took on that chart in handling a request?”

http-graph-marked-200

Well, now it can – if you’re using the latest Webmachine. The above graph is scaled down from a tool I created that is packaged with the just-released Webmachine 1.0.

You can read all about the tool on the Webmachine Debugging page, but the basic idea is: draw an edge between each node in the graph that is traversed during the processing of a request. Clicking on decisions brings up a panel that details what resource module functions were called at that decision (as well as what their parameters and return values were). There are also panels for the details about the request itself (method, path, headers, body) and the response (code, headers, body).

I’ve put up three example traces from BeerRiot’s beer_resource:

  1. The first hits /beer/1, and runs successfully all the way to 200: trace-example-200.html
  2. The second hits /beer/9999, which doesn’t exist, and runs to 404: trace-example-404.html
  3. The third hits /beer/536, which used to exist, but has since been merged with beer 202, so it runs to 301: trace-example-301.html

You’re floored, right? You just can’t wait to get your hands on it, right? </modesty> Well, doing so is easy. Once you have your Webmachine resource written, just change your init/0 function from:

init(Config) ->
   {ok, Config}.

to:

init(Config) ->
   {{trace, "/tmp"}, Config}.

Recompile and reload the module, then issue an HTTP request to it. You should see a file show up in /tmp with the extension .wmtrace.

Now open the Erlang shell for your webmachine application and type:

wmtrace_resource:add_dispatch_rule("wmtrace", "/tmp").

Tab over to your browser and hit /wmtrace/. You should see the list of .wmtrace files in your /tmp directory. Click on one of them, and you’ll be in the trace inspection utility.

For any pre-webmachine-1.0 users, getting access to this utility requires converting your resources to the new webmachine with referential transparency, but I can tell you from experience that that process is largely mechanical, and not that time consuming. I translated BeerRiot’s ~7000loc in about 2 hours (including testing).

I’d love to hear feedback about the trace tool. It’s the product of about three days of hacking (one proof-of-concept, one nearly-complete-rewrite, one actual-improvement), so I’m not hard-set on much of anything.

Doing it Live

We have a phrase around the office: “Do it live!” It comes from the incredible freakout of Bill O’Reilly. We use it to mean something along the lines of, “This is a startup. The plan might change at any time. Changes go to production when we need them to, and we roll with bugs as best we can.” Far from encouraging careless, fickle choices, it’s a reminder that the camera is on, we’re live, and we are actively developing a product that is under close scrutiny.

Luckily, we have the power of Erlang behind us. The dynamic nature of the language and runtime is a fantastic fit for an environment in which things may change at a moment’s notice.

Erlang’s dynamic nature also came in useful for me on BeerRiot last night. I’ve blogged about hot code loading before, but last night I dipped into the world of OTP applications and Mnesia.

I realized late yesterday afternoon that I had left the login code in a state where usernames were case-sensitive. People could have signed up as “Bryan” and “BRYAN”, even though I already owned the login “bryan”. Basically, I was lazy; the username lookup code was roughly:

%% Name is the test username as read out of the http request
mnesia:transaction(
    fun() ->
        mnesia:match_object(#person{name=Name, _='_'})
    end).

What I needed to do was downcase both the test name and the stored name, and compare those results. I could have just tossed in a call to string:to_lower and reloaded the login module, except that I’m trying to support UTF-8 everywhere. To downcase a UTF-8 string, I needed another library (because I’m not going to both implementing my own).

Google pointed me in the direction of Starling. Despite the strange build process[1], starling provides an Erlang interface to the ICU libraries, to enable unicode manipulations. A quick build and test, and we have

LowerName = ustring:downcase(ustring:new(Name))

Toss an application:start(starling) in the BeerRiot startup code, and everything’s set to go … but why would I want to restart the webserver? Restarting is lame – we’re doing it live!

Instead of restarting, we’ll connect to the webserver through an erl shell (see my earlier hot code loading post about doing this) and modify the running system. We just need two simple commands to get this done.

1> code:add_paths(["/path/to/starling/ebin"]).
ok
2> application:start(starling).

Command 1 tells Erlang to add a path to its library loading search. Command 2 starts the starling application. Starling is now up and running, and we can ustring:downcase/1 as much as we want.

But, I really don’t want to downcase every stored username every time. It’s also kind of nice for people’s usernames to display as they typed them, but not require the same capitalization in their login form. So, I’ll need to store the downcased version somewhere, in addition to keeping the original. I could put it in a new table, mapping back to the persons table, but it’s person data – let’s keep it with the person.

I need to add a field to my person record. But if I do that, all of my code looking for a person record of the current format will break. I need to update all of my person records in storage as soon as I load the code with the modified person record definition.

Mnesia gives us just the tool for this: mnesia:transform_table/3. All we have to do is provide a function that knows how to translate an old person record into a new one. Something like this will do:

%% old def: -record(person, {id, name}).
%% new def: -record(person, {id, name, login}).
add_login() ->
    mnesia:transform_table(
        person,
        fun({person, Id, Name}) ->
            {person, Id, Name, ustring:downcase(ustring:new(Name))}
        end,
        record_info(fields, person).

Stick that code in the person module, where the person record is defined. Now, connect back to the webserver and simply:

3> l(person).
{module, person}
4> person:add_login().

There’s a short period of time in there, between the ends of commands 3 and 4 where any code that looks up a person record will break. But, it’s short, and the entire rest of the site will continue functioning flawlessly.

And that’s the amazing power of Erlang. A very brief, very limited hiccup, and new functionality is deployed. Assuming the appropriate code was put in place to start everything up on restart, the system will come up in exactly the state you want it if the server should ever reboot.

Now back to tinkering… :)

[1]I oughta ‘make‘ you ‘rake’ my lawn, which you’re on, by the way, sonny.

Redesign

A year in the making, almost completely rewritten, I can’t bear to hold it back any longer: today I release the new BeerRiot. Here’s a synopsis of the changes for you:

Old Tech New Tech
Erlyweb (Yaws) Webmachine (Mochiweb)
Erlydb + MySQL Hand-coded models + Mnesia + Apache Solr
ErlTL jQuery

I’ll probably write a blog post about each of those rows sometime in the near future. It should be said though, the my motivation in this rewrite was not to abandon Erlyweb. Rather each piece was a deliberate attempt to get practice on something we were using at work. Erlyweb’s great, but Webmachine has a different feel. MySQL can store data fine, but it’s quite different from the key-value store I hack against all day. ErlTL’s pretty nice as templating languages go, but I needed more DOM experience.

Luckily, the new technologies are also very nice. Webmachine forces you to become more familiar with the ins and outs of HTTP, but after writing a few resources, it becomes natural and quick to create new ones. Storing data in Erlang format in Mnesia is heavenly, and Solr has drastically improved search functionality. jQuery makes JavaScript in the browser far less painful.

In doing this rewrite, it’s been a real eye-opener to dig back through my early Erlang code. It wasn’t terrible, but having worked with other serious Erlang hackers all year, I notice the difference in the code I write now. The site should be much more stable now – Local/maps may even stay up for more than an hour. ;)

In that vein, though, I request that you not judge the JS running the site too harshly just yet. Just like my early Erlang was ugly, I can now tell that my early JS was ugly as well. That will be getting some cleanup soon, but I just couldn’t stand delaying the release for it.

So, go poke it and let me know what you think!

WWW is back

Unfortunately, I only found out this morning that some of you probably thought I had given up on BeerRiot. A month or so ago, I started hosting another domain on the same server, and thought I’d set up Yaws to handle that properly. Unfortunately, I botched it bad enough that while beerriot.com still worked, http://www.beerriot.com didn’t. Sorry!

Anyway, things are fixed now, so if you like prepending “www.”, welcome back!

You may have also questioned the lack of new feature announcements of late. The reason behind this is a giant rewrite I’ve been working on all year. More details in a few weeks, but it’s almost an entire redesign that touches every level. At the same time, work has been crazy busy all year. I’m excited to share the new stuff with you all, though, so keep me in your RSS feed, so you don’t miss the announcement.

Denormalization, Processes

If you read the news, you’ll know that tuneups are happening behind the scenes of BeerRiot. If you came to this blog after reading that story, you’re wondering what, exactly, they are.

If I’m not feeling particularly communication-challenged, I’ll be able to explain them to you. ;)

The first tuneup is one every webmaster has heard of: denormalization. I had been using a view to select data from three tables with one call. The performance drag of that query was serious enough, though, that I’ve decided to complicate things a bit and copy the extra bits of data I need from the other tables into the main one for the query.

The speed gain is great, and, somewhat strangely, the denormalization actually cleaned up a bunch of my code. ErlyDB lacks a “one-to-one” relation, so it was impossible for me to say “each record in this view is really just a record in this other table with some extra data.” That made for a bit of hackery swinging from one type to another. Without that extra table, I think the code reads more clearly.

(Disclaimer: I’m far from being an relational database master, so it’s likely that there is a much better way to express everything I’m doing. But, I’m happy to be making what seems to be forward progress.)

The other main change is more Erlang-centric. Until now, I had been tracking sessions using a customization of the Yaws recommended session server. This is basically a central process that stores opaque data associated with an id string. Whenever your app gets a request, it pulls the cookie value out and checks with this central process to find out if there is any opaque data associated with this key. It works (quite well, in fact), but it seems like a bit of a bottle neck.

So, I’ve decided that there’s a more Erlangy way to do things. What BeerRiot is doing now is starting up a new process for each session, and saving that process id in a client cookie. Then, whenever a request comes in, if it has a cookie with a PID, we can try to contact that session’s handling process directly. No central service required.

It turns out that there’s loads of benefits to having this session hanging around beyond relieving the central service bottleneck. It can cache data, smartly (i.e. listen for updates, etc.). It’s a natural place to run background processes (like propagating live changes to durable storage). I see other potential uses, but since I haven’t tested them yet, I’ll hold my tongue to avoid getting too many hopes up. ;)

For Facebook developers: This process-session system wasn’t possible until just a few weeks ago, when Facebook started supporting cookies on the canvas page. Unfortunately, they only support them for canvas requests, and not for their “mock ajax.” For mock ajax, I’ve decided to just encode the cookie values in post parameters. It works (and it’s no more inconsistent than the rest of the Facebook Developer experience).

Update 2.Jan 18:52 EDT: If you spent any part of today poking at BeerRiot to see how the speed-ups turned out, you were probably rather dissatisfied. I just figured out that I didn’t fully rollout the update. :P It’s there now, and I think you’ll be much more impressed.

Erlang2Facebook Updates

I’ve just committed a couple of minor updates to the erlang2facebook library that I’m sure some of you are interested in.

The first (SVN revisions 7 & 9) is an API update to follow the Facebook team’s changes to profile_setFBML. Now, instead of just passing a single chunk of FBML, containing markup for the profile box, profile actions, and mobile profile, there are three distinct fields to shove those chunks in. Sorry about the non-consecutive SVN commits. :P

The second update (SVN revision 8 ) is intended to show how to use ErlTL better (thanks for the tips, Yariv!). I’ve created render.et, and moved all of the render_* functions from canvas_controller into it. This allows me to use the more HTML-like syntax (code efficiency), while also taking advantage of ErlTL’s automatic use of Erlang’s binaries (runtime efficiency).

Follow

Get every new post delivered to your Inbox.