Archive for the ‘Webmachine’ Category

Simple Webmachine Extension (2/4): Authorization

This is the second post in a four-part series about extending a simple Webmachine resource. The first part discussed adding support for the HTTP method PUT.


Something about modification of server state screams, “Password protection!” at me. Let’s guard the PUT method with Basic auth:


-define(AUTH_HEAD, "Basic realm=MyOSEnv").

is_authorized(RD, Ctx) ->
    case wrq:method(RD) of
        'PUT' -> basic_auth(RD, Ctx);
        _     -> {true, RD, Ctx}
    end.

%% Returning a string instead of 'true' makes Webmachine answer 401
%% and use that string as the WWW-Authenticate header.
basic_auth(RD, Ctx) ->
    case wrq:get_req_header("Authorization", RD) of
        "Basic "++Base64 ->
            case string:tokens(base64:mime_decode_to_string(Base64), ":") of
                ["webmachine", "rules"] -> {true, RD, Ctx};
                _                       -> {?AUTH_HEAD, RD, Ctx}
            end;
        _ -> {?AUTH_HEAD, RD, Ctx}
    end.

Arbitrary decisions:

  • Only PUT is protected. If GET and HEAD should be protected as well, just replace the body of is_authorized/2 with the body of basic_auth/2.
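
The header parsing inside basic_auth/2 can also be pulled out into a pure function, which makes it easy to try in the shell. What follows is only a sketch, not part of the resource above: the module name basic_auth_sketch and the function credentials/1 are made up for illustration. It assumes the header value arrives as a string (or the atom undefined when the header is missing), and it splits on the first colon only, so a password containing ":" survives.

```erlang
-module(basic_auth_sketch).
-export([credentials/1]).

%% Hypothetical helper: turn an Authorization header value into
%% {User, Password}, or 'error' if it is not usable Basic auth.
credentials("Basic " ++ Base64) ->
    try base64:mime_decode_to_string(Base64) of
        Decoded ->
            %% Split at the first colon; the rest is the password.
            case lists:splitwith(fun(C) -> C =/= $: end, Decoded) of
                {User, [$: | Pass]} -> {User, Pass};
                _                   -> error
            end
    catch
        %% Malformed base64 is treated like a missing header.
        _:_ -> error
    end;
credentials(_) ->
    error.
```

With something like this in place, basic_auth/2 would only need to compare the returned tuple against the expected username and password, and fall back to ?AUTH_HEAD on anything else.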

I need to update my curl command if I don’t want to be told I’m unauthorized:

$ curl -u webmachine:rules -X PUT -H "Content-type: application/json" \
   http://localhost:8000/_env/MY_VAR -d "\"yay\""

Come back tomorrow for part three, where I add a modicum of atomicity.

Update: part three is up.

Simple Webmachine Extension (1/4): PUT

I was in need of a break last night after flailing in the face of a new library-language-build-system for a few hours. So, I decided to hack some Webmachine (it was too late to sit down at the trap set).

I was thinking about how the os-environment resource from my last post could be extended. This post begins a four-part series in which new capabilities are added to env_resource.erl.


Let’s start with modification. Why not allow setting those variables?

I need to do three things: announce that the resource supports PUT, list what types of data it accepts, and then actually handle decoding the incoming data. Fifteen quick lines of code should handle it:

-export([allowed_methods/2, content_types_accepted/2, from_json/2]).

allowed_methods(RD, Ctx) ->
    {['GET', 'HEAD', 'PUT'], RD, Ctx}.

content_types_accepted(RD, Ctx) ->
    {[{"application/json", from_json}], RD, Ctx}.

from_json(RD, Ctx) ->
    case wrq:path_info(env, RD) of
        undefined ->
            %% /_env: body is a JSON object of name/value pairs
            {struct, MJ} = mochijson:decode(wrq:req_body(RD)),
            [ os:putenv(K, V) || {K, V} <- MJ ];
        Env ->
            %% /_env/VARIABLE: body is a single JSON string
            MJ = mochijson:decode(wrq:req_body(RD)),
            os:putenv(Env, MJ)
    end,
    {true, RD, Ctx}.

Arbitrary decisions were made above:

  • Clients shall send JSON-encoded data. I could have just as easily added another element to the return value of content_types_accepted/2 and written another method to handle it (e.g. {"text/plain", from_text} and from_text/2). Accepting JSON is nice, since it makes PUT and GET simply symmetric.
  • /_env expects a JSON structure, while /_env/VARIABLE expects a string. Again with the simple symmetry between PUT and GET.
  • /_env only modifies or creates the variables described in the JSON structure it receives. Alternatively, it could have cleared out any unnamed environment variables, but this seemed unnecessary.
  • No body is returned in a successful response. It would have been fairly simple to generate the same body that would have been returned in a GET, then use wrq:append_to_response_body/2 to return a modified RD, but this also seemed unnecessary.
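
One thing the code above does not do is check what the client actually sent. As far as I understand mochijson, a JSON number decodes to an Erlang number, and os:putenv/2 only takes strings, so a body like {"MY_VAR": 1} would crash the request. Here is a minimal sketch of a validating step, assuming the decoded term has the {struct, Pairs} shape used above. The module env_put_sketch and the function apply_struct/1 are hypothetical names, not anything from Webmachine or mochijson.

```erlang
-module(env_put_sketch).
-export([apply_struct/1]).

%% Hypothetical helper: set every {Name, Value} pair from a decoded
%% {struct, Pairs} term. Nothing is set unless all pairs are valid,
%% and variables not named in Pairs are left alone.
apply_struct({struct, Pairs}) when is_list(Pairs) ->
    case lists:all(fun is_env_pair/1, Pairs) of
        true ->
            lists:foreach(fun({K, V}) -> os:putenv(K, V) end, Pairs),
            ok;
        false ->
            {error, not_strings}
    end;
apply_struct(_) ->
    {error, not_a_struct}.

%% A usable pair has a non-empty name without "=" and a string value.
is_env_pair({K, V}) ->
    is_string(K) andalso K =/= "" andalso
        not lists:member($=, K) andalso is_string(V);
is_env_pair(_) ->
    false.

is_string(S) ->
    is_list(S) andalso lists:all(fun is_integer/1, S).
```

In from_json/2, an error result could be turned into a {halt, 400} return instead of {true, RD, Ctx}; that choice is another arbitrary decision.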

I can now set MY_VAR to "hello", using two different curl commands:

$ curl -X PUT -H "Content-type: application/json" \
   http://localhost:8000/_env -d "{\"MY_VAR\":\"hello\"}"

$ curl -X PUT -H "Content-type: application/json" \
   http://localhost:8000/_env/MY_VAR -d "\"hello\""

Come back tomorrow for part two, in which I’ll add authorization via username and password.
Update: part two is up.

Simple Webmachine – Proper HTTP Resources

Update: This post sparked my Webmachine nerve in such a way that I wrote four more posts about extending the resource described below. Read them if you’d like to see how this resource can evolve.

This morning I read a post about CouchDB’s HTTP Handlers. Jón Grétar Borgþórsson demonstrates how one might implement a handler that serves a JSON structure describing the OS environment variables.

I thought that two additional pieces of code might be interesting. I’ll lead with an example showing the most likely way this resource would have been coded for Webmachine:

%% dispatch:
%% {["_env"],      env_resource, []}.
%% {["_env", env], env_resource, []}.

-export([init/1, content_types_provided/2, resource_exists/2, to_json/2]).

init(_) -> {ok, undefined}.

content_types_provided(RD, Ctx) ->
    {[{"application/json", to_json}], RD, Ctx}.

resource_exists(RD, Ctx) ->
    case wrq:path_info(env, RD) of
        undefined ->
            %% /_env: every variable, as a JSON object
            Result = [ list_to_tuple(string:tokens(E, "="))
                       || E <- os:getenv() ],
            {true, RD, {struct, Result}};
        Env ->
            %% /_env/VARIABLE: 404 if the variable is not set
            case os:getenv(Env) of
                false  -> {false, RD, Ctx};
                Result -> {true, RD, Result}
            end
    end.

to_json(RD, Result) ->
    {mochijson:encode(Result), RD, Result}.

The biggest difference to note between the Webmachine version and the CouchDB version is that the proper HTTP status codes are returned from the Webmachine resource. The CouchDB handler returns 500 or 405 when the requested resource is not found. The proper status for this case is 404. Webmachine knows this, and handles the choice automatically.

A little extra emphasis, I think, is appropriate here: Webmachine chooses the proper response code for you. You define methods that describe the state of your resource (like whether or not it exists, what methods it allows, etc.), and Webmachine negotiates the muck of HTTP.

As a bonus, the Webmachine resource is the same length, while at the same time being less dense and more readable.
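
A caveat about the list comprehension in resource_exists/2: string:tokens/2 splits on every "=" and drops empty tokens, so an entry like "OPTS=a=b" becomes a three-element tuple and "EMPTY=" becomes a one-element tuple, neither of which is a name/value pair. A minimal sketch of a split that only cuts at the first "=" is below. The module env_pairs_sketch and its functions are illustrative names of my own, not part of the resource.

```erlang
-module(env_pairs_sketch).
-export([split_env/1, all/0]).

%% Hypothetical helper: split "NAME=VALUE" at the first "=" only,
%% always returning a two-element tuple.
split_env(Entry) ->
    case lists:splitwith(fun(C) -> C =/= $= end, Entry) of
        {Name, [$= | Value]} -> {Name, Value};
        {Name, []}           -> {Name, ""}
    end.

%% Every environment variable as a {Name, Value} pair.
all() ->
    [ split_env(E) || E <- os:getenv() ].
```

Swapping env_pairs_sketch:all() in for the comprehension would keep the {struct, Result} shape the same while tolerating values that contain "=" or are empty.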

Let’s not get hasty, though. If there is a really good reason for returning an alternate status code, Webmachine won’t get in your way. To prove it, here’s a Webmachine resource that (as near as I can tell, I’m not a CouchDB guru) returns exactly the same statuses as the CouchDB handler:

%% dispatch:
%% {["_env2", '*'], env_resource, []}.

-export([init/1, content_types_provided/2, resource_exists/2, to_json/2]).

init(_) -> {ok, undefined}.

content_types_provided(RD, Ctx) ->
    {[{"application/json", to_json}], RD, Ctx}.

resource_exists(RD, Ctx) ->
    case string:tokens(wrq:disp_path(RD), "/") of
        [] ->
            Result = [ list_to_tuple(string:tokens(E, "="))
                       || E <- os:getenv() ],
            {true, RD, {struct, Result}};
        [Env] ->
            case os:getenv(Env) of
                false ->
                    %% mimic the CouchDB handler: 500 with a JSON error body
                    {{halt, 500},
                     wrq:set_resp_header(
                       "Content-type", "application/json",
                       wrq:set_resp_body(
                         mochijson:encode(
                           {struct, [{error, "not_found"},
                                     {reason, "Variable Not Found"}]}),
                         RD)),
                     Ctx};
                Result ->
                    {true, RD, Result}
            end;
        _ ->
            %% more than one path segment: 405, as the CouchDB handler does
            {{halt, 405},
             wrq:set_resp_header("Allow", "GET,HEAD", RD),
             Ctx}
    end.

to_json(RD, Result) ->
    {mochijson:encode(Result), RD, Result}.

HTTP Decision Graph Comes to Life

Sometime last year, a flowchart describing the processing of an HTTP request (as rendered by Alan Dean) made its way around the net.


I thought, “Wouldn’t it be cool if your webserver could plot the path it took on that chart in handling a request?”


Well, now it can – if you’re using the latest Webmachine. The above graph is scaled down from a tool I created that is packaged with the just-released Webmachine 1.0.

You can read all about the tool on the Webmachine Debugging page, but the basic idea is: draw an edge between each node in the graph that is traversed during the processing of a request. Clicking on decisions brings up a panel that details what resource module functions were called at that decision (as well as what their parameters and return values were). There are also panels for the details about the request itself (method, path, headers, body) and the response (code, headers, body).

I’ve put up three example traces from BeerRiot’s beer_resource:

  1. The first hits /beer/1, and runs successfully all the way to 200: trace-example-200.html
  2. The second hits /beer/9999, which doesn’t exist, and runs to 404: trace-example-404.html
  3. The third hits /beer/536, which used to exist, but has since been merged with beer 202, so it runs to 301: trace-example-301.html

You’re floored, right? You just can’t wait to get your hands on it, right? </modesty> Well, doing so is easy. Once you have your Webmachine resource written, just change your init/1 function from:

init(Config) ->
   {ok, Config}.

to:


init(Config) ->
   {{trace, "/tmp"}, Config}.

Recompile and reload the module, then issue an HTTP request to it. You should see a file show up in /tmp with the extension .wmtrace.

Now open the Erlang shell for your webmachine application and type:

wmtrace_resource:add_dispatch_rule("wmtrace", "/tmp").

Tab over to your browser and hit /wmtrace/. You should see the list of .wmtrace files in your /tmp directory. Click on one of them, and you’ll be in the trace inspection utility.

For any pre-webmachine-1.0 users, getting access to this utility requires converting your resources to the new webmachine with referential transparency, but I can tell you from experience that the process is largely mechanical, and not that time consuming. I translated BeerRiot’s ~7000 lines of code in about 2 hours (including testing).

I’d love to hear feedback about the trace tool. It’s the product of about three days of hacking (one proof-of-concept, one nearly-complete-rewrite, one actual-improvement), so I’m not hard-set on much of anything.

Webmachine Released!

One of the cool technologies I mentioned in my last post has just been released open-source. Webmachine is a nice framework for creating web-friendly resources. We use it as the engine for serving our dynamic web content. For a bit more color to the description, read Justin’s post.

I like it so much I may even attempt a BeerRiot port at some point…