Beer IoT (Part 4)

This is part four of a series on monitoring homebrew fermentation. In parts one, two, and three, I experimented with data I downloaded from one platform and uploaded to another. In this part, I create some new sensors to try.

I have hardware!

Helium Atom connected to an ADXL345

And it’s pretty slick. Using any I2C device with Helium’s wrappers is some of the easiest hardware hacking I’ve ever done. This is my first time using Lua, but while it made some different choices than other common languages, it has been very easy to learn.

Maybe an example will prove my point. This is how you take a reading from an ADXL345 accelerometer (“he” is Helium’s built-in library):

While building such a script, you can fill it with print statements and run it whenever you like by connecting the Atom to your computer via USB cable. This all makes it super easy to learn how a new sensor works.

When you’ve acquired the measurements you want to save, you send them to Helium’s cloud platform like this:

Once you have posted data, you can use Helium’s dashboard to check it out:

helium-dashboard-1

This system is so smooth that in just a week (of evenings) I’ve been able to write scripts to take readings from two different sensors. Those sensors are now sitting in the bottoms of two carboys monitoring the fermentation of an English Mild. Yes, the first thing I did with my new electronics was submerge them in infected sugar water. I tested the water tightness of their containers … oh, at least several times.

helium-submerged
Foreground: carboy with “tilt” sensor, carboy with “sink” sensor, carboy with BeerBug; Background: Helium Atoms in red container, airlock/blowoff in green container

What monitoring a fermentation amounts to is measuring the density of the liquid. Water with sugar in it is denser than water on its own, or water with alcohol in it. As the yeast convert the sugar to alcohol, the liquid becomes less dense.

Most tools test the density of the liquid indirectly, by instead testing the buoyancy of a known float. The standard hydrometer is a float with a scale attached, so you can read how high it’s floating by looking at it.

The device I’m looking to replace, the BeerBug, reads this float-height by suspending the float from a flexible metal tongue, which is also connected to a magnet, whose position is read by a hall-effect sensor. As the float floats higher, the magnet nears the sensor, producing a stronger reading. It requires that you measure the gravity of your liquid with a hydrometer first, but once the initial reading is calibrated, the change in buoyancy can be measured (the magnet moves farther from the sensor as the beer ferments).

BeerBug operation – left: pre-ferment, right: post-ferment

I wasn’t able to obtain a hall-effect sensor as quickly as I wanted, so my devices take different approaches. The first is based on someone else’s design. By making the float very buoyant on one end, and just barely not able to float on pure water on the other, the angle at which the float floats will change with the density of the liquid. So the float should start close to horizontal when the unfermented beer is very sugary, and end up more steeply tilted as the sugar is converted to alcohol. The sensor in this float is thus the ADXL345 accelerometer that the above code demonstrates using. By measuring the direction of the force of gravity, we can figure out what angle the sensor is floating at.

Tilt operation – left: pre-ferment, right: post-ferment
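
As a rough illustration of how force readings become an angle, here is a minimal Python sketch. It is not the Lua running on the Atom. It assumes the accelerometer reports the three components of gravity and that the sensor's z axis runs along the float; the axis choice and the function name are hypothetical.

```python
import math


def tilt_from_vertical(x, y, z):
    """Return the angle, in degrees, between the sensor's z axis and vertical.

    x, y, z are the gravity components reported by the accelerometer, in any
    consistent unit (only their ratio matters).
    """
    # Size of the gravity component perpendicular to the z axis.
    horizontal = math.hypot(x, y)
    return math.degrees(math.atan2(horizontal, z))


# Gravity entirely along z: the float is standing straight up.
print(tilt_from_vertical(0.0, 0.0, 1.0))
# Gravity split evenly between x and z: the float is halfway over.
print(tilt_from_vertical(1.0, 0.0, 1.0))
```

Mapping that angle to a specific gravity would still need calibration against a hydrometer, since it depends on how the float is weighted.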

The idea behind the second experimental sensor is to directly measure the increased pressure from the denser liquid, instead of measuring its effect on buoyancy. I’ve placed an atmospheric pressure sensor in a non-rigid housing, which should allow the liquid to squeeze the air around the sensor, raising the pressure around it. As the liquid becomes less dense, the pressure should reduce. The sensor has been placed at the bottom of the carboy, to get as much liquid above it to provide pressure as possible. I’m also taking readings from the pressure sensor on the Atom, which is sitting in the open air outside the carboy, so I can compensate for weather-related pressure changes.

Sink operation (percent pressure as compared with pure water): left: pre-ferment specific gravity of 1.040; right: post-ferment sg of 1.010
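
To get a feel for the size of the signal, here is a small Python sketch of the hydrostatic relationship (pressure from a liquid column is density times gravity times depth). The 0.4 m depth and the ambient pressure are made-up illustration values, not measurements from my carboys, and the sketch assumes the flexible housing passes the liquid's pressure on to the sensor unchanged.

```python
WATER_DENSITY_KG_M3 = 1000.0  # approximate density of water
GRAVITY_M_S2 = 9.80665        # standard gravity


def liquid_pressure_pa(specific_gravity, depth_m):
    """Pressure (Pa) added by a column of liquid of the given depth."""
    return specific_gravity * WATER_DENSITY_KG_M3 * GRAVITY_M_S2 * depth_m


def estimate_specific_gravity(submerged_pa, ambient_pa, depth_m):
    """Invert the relationship, subtracting open-air pressure first."""
    gauge_pa = submerged_pa - ambient_pa
    return gauge_pa / (WATER_DENSITY_KG_M3 * GRAVITY_M_S2 * depth_m)


DEPTH_M = 0.4          # hypothetical depth of liquid above the sensor
AMBIENT_PA = 101325.0  # hypothetical open-air reading from the Atom

before = liquid_pressure_pa(1.040, DEPTH_M)
after = liquid_pressure_pa(1.010, DEPTH_M)
print("pressure drop over the ferment (Pa):", round(before - after, 1))
print("recovered sg:", estimate_specific_gravity(AMBIENT_PA + after, AMBIENT_PA, DEPTH_M))
```

At that assumed depth the whole fermentation moves the reading by only about 118 Pa, which is why subtracting the open-air reading matters: weather can move the ambient pressure by more than that.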

So far, I’m just collecting raw data: pressure readings in the latter case, and force readings in the former. It’s going to take some analysis to figure out what they mean. Unfortunately, the BeerBug site is currently only serving the most recent reading, and not history, so direct comparison of data will not be possible for now. The Helium site is running smoothly, though – and in addition to their dashboard, as shown above, I can also use the graphing code from my earlier experiments:

I’ve shared the code I’m using for these experiments on Github. Please feel free to download and use the code yourself, or to suggest ways I can improve my Lua! Check back soon for analysis of how the measurement and fermentation went.

Update: the first bit of analysis, from the temperature sensors, is up in part five.

Beer IoT (Part 3)

My code is ugly, but it works, so it’s time to post part three of this series. In part one, I downloaded data captured by my BeerBug. In part two, I uploaded it to the Helium platform. In this entry, I’ll use Helium’s API to query and graph the data.

If I were dealing with a currently-active data source, Helium’s dashboard would allow me to view what was happening. That is a fantastic resource for developers, because it takes one step of uncertainty out of the equation by allowing inspection in the middle of the pipeline. But, “currently-active” is limited to 90 days in the dashboard, and my data is about a year old, so I need something else.

What I have built are a few simple D3 graphs:

beerbug-on-helium-screenshot

Each graphs the average value for a time slice as a dark line, with a lighter band around it marking the range from minimum to maximum. It’s crude, but it gets the point across. You can move earlier and later in the range by dragging left and right. Zoom in by holding shift while dragging to select a region. Zoom out by holding alt while dragging to select a region.

As I said before, it’s ugly, but I’ve put the code in a gist, if you’re looking for examples to follow (it’s neither well-organized nor well-documented, but if you’re also working with the Helium API, you may pick up on a clue of what you’re looking for).

Some things that made this graphing easy:

  • Helium supports CORS, so I didn’t even have to set up a proxy webservice. Loading graph.html from a file:// URL still allowed me to make requests to Helium for the data.
  • D3 has a wide variety of basic example graphs. What I started with was a basic mash-up of the Line Chart and Bivariate Area Chart examples.
  • Helium’s API will give you the latest data for your sensor (note: no 90-day window here), if you don’t provide an end filter, and also include a “previous” link in the response to get the next-latest data.

Some things that made this graphing hard (or at least tricky):

  • D3 defaults to local time, but Helium is all in UTC. Forgetting to translate leads to confusing debugging about why offset calculations are wrong.
  • Helium’s API will always give you the latest data for your sensor, if you don’t provide an end filter. That is, you can really only follow “previous” links backward through time. Once you follow a “previous” link, you’ll get a “next” link, but you should already have the data that link would give you. You can’t begin with a start filter and expect to follow “next” links to the latest data.
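
The backward-only paging described above can be sketched in Python. This is not the code from my gist: the response shape (a "data" list plus a "links" object) and the "prev" key are assumptions for illustration, and the fetch function is passed in so that no particular HTTP client is implied.

```python
def collect_backward(fetch_page, first_url, link_name="prev", max_pages=1000):
    """Walk from the newest page toward older ones, gathering data points.

    fetch_page(url) must return a dict shaped like
    {"data": [...], "links": {link_name: url_or_None}}  (assumed shape).
    """
    points = []
    url = first_url
    pages = 0
    while url is not None and pages < max_pages:
        page = fetch_page(url)
        points.extend(page.get("data", []))
        # Only ever follow the link to older data; "next" would repeat
        # pages that have already been collected.
        url = page.get("links", {}).get(link_name)
        pages += 1
    return points


# Stand-in for the network: two pages, newest first.
FAKE_PAGES = {
    "latest": {"data": [3, 2], "links": {"prev": "older"}},
    "older": {"data": [1], "links": {"prev": None}},
}
print(collect_backward(FAKE_PAGES.get, "latest"))
```

Because the walk starts at the newest data, the collected points come out newest-first; reverse the list before graphing if you want chronological order.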

I’m posting this simple viewer now instead of waiting until I’ve had time to clean it up more, because the next step is probably a rewrite. As expected, Helium’s API works really well for supporting a simple dashboard: if you’re concerned with recent updates, and then scrolling back in time from there, the API makes it easy. But, what I learned during a Helium presentation at a meetup this week is that the real purpose of this API is to allow Helium’s servers to act as a transport between your sensors and your own servers. The expectation is that you’ll grab data from Helium, store it in your own database, and serve your app from your own storage.

Helium-as-transport is an interesting bet. It’s focusing on exactly the problem I’ve had with my BeerBug: I have to rely on their site for the tool to be useful. If Helium can keep the path from device to my analysis up more reliably, they will succeed in their goal of making sensor IoT more available to people that want to focus on the sensing and the analysis, without worrying about the infrastructure in between (i.e. basically everyone).

Update: Part 4 is up – hardware on display!

FSMs Make Instrumentation Easy

This piece originally appeared on the Honeycomb.io blog as part of a series on instrumentation.

There is a way to structure programs that makes inclusion of instrumentation straightforward and automatic, and it’s one that every hardware and software engineer should be completely familiar with: finite state machines. You have seen them time and again as illustration of how a system works:

Turnstile state machine

What makes FSM instrumentation straightforward is that the place to expose information is obvious: along the edges, when the state of the system is changing. What makes it automatic is that some generic actor is usually driving a host of specific FSMs. You only need to instrument the actor (“entering state Q with message P”, “leaving state S with result R”), and every FSM it runs will be instrumented for free.
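
A toy version of that idea in Python, using the turnstile as the FSM. This is only a sketch of the pattern, not Webmachine's implementation; the function and table names are made up for illustration.

```python
def run_fsm(transitions, state, events, log):
    """Generic actor: drives any FSM given as a {(state, event): next_state} table.

    All instrumentation lives here, so every table run through this
    function is traced without containing any logging code of its own.
    """
    for event in events:
        log(f"entering state {state} with message {event}")
        next_state = transitions[(state, event)]
        log(f"leaving state {state} with result {next_state}")
        state = next_state
    return state


# The turnstile: a coin unlocks it, a push locks it again.
TURNSTILE = {
    ("locked", "coin"): "unlocked",
    ("locked", "push"): "locked",
    ("unlocked", "coin"): "unlocked",
    ("unlocked", "push"): "locked",
}

trace = []
final_state = run_fsm(TURNSTILE, "locked", ["coin", "push"], trace.append)
print(final_state)
print("\n".join(trace))
```

The transition table knows nothing about logging; swapping in a different table gets the same trace format for free.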

I learned how easy FSMs are to instrument while working on Webmachine, the webserver that is known for implementing the “HTTP Flowchart”.

HTTP Diagram

Each Webmachine resource (a module handling a request) is composed of a set of decision functions. The functions are named for the points in the flowchart where decisions have to be made about which branch to follow. This is just alternate terminology, though: the flowchart and resource describe an FSM, in which the decision points (and terminals) are states.

Driving the execution of a Webmachine resource is a module called webmachine_decision_core. This is where the logic lives for which function to call, and which branch to take based on the result. It triggers each function evaluation by calling a generic webmachine_resource:resource_call function, with the name of the decision.

resource_call(F, ReqData,
              #wm_resource{
                 module=R_Mod,
                 modstate=R_ModState,
                 trace=R_Trace
                }) ->
    case R_Trace of
        false -> nop;
        _ -> log_call(R_Trace, attempt, R_Mod, F, [ReqData, R_ModState])
    end,
    Result = try
        apply(R_Mod, F, [ReqData, R_ModState])
    catch C:R ->
            Reason = {C, R, trim_trace(erlang:get_stacktrace())},
            {{error, Reason}, ReqData, R_ModState}
    end,
    case R_Trace of
        false -> nop;
        _ -> log_call(R_Trace, result, R_Mod, F, Result)
    end,
    Result.

This is where the ease of instrumenting an FSM is obvious. The entirety of the hooks needed to support tracing and visual debugging of every Webmachine resource are those two log_call lines. They record the entrance and exit of each state of the FSM without requiring any code to complicate the implementation of the resource module itself. For example, a simple resource:

-module(blogapp_resource).
-export([
    init/1,
    content_types_provided/2,
    to_html/2
]).

-include_lib("webmachine/include/webmachine.hrl").

init([]) ->
    {{trace, "/tmp"}, undefined}.

content_types_provided(ReqData, State) ->
    {[{"text/html", to_html}], ReqData, State}.

to_html(ReqData, State) ->
    {"<html><body>Hello, new world</body></html>", ReqData, State}.

This resource does no logging of its own (as you can see), but for each request it receives, a file is created in /tmp that can be rendered with the Webmachine visual debugger. For example, the processing for a request that specifies Accept: text/html looks like this (live example):

heavy-happy-path

It’s easy to see that the request made it all the way to the 200 OK result at grid location N18. Along the way, it passed through many decisions where the default behavior was chosen (grey-outlined diamonds), and a few where the resource’s own implementation was called (purple-outlined diamonds). Clicking on any decision will display more information about what happened there.

In contrast, the processing for a request that specifies Accept: application/json looks like this (live example):

heavy-error-path

Now it’s easy to see that the request stopped at the 406 Not Acceptable result at grid location C7 instead. For no more code than specifying where to put the log output, we’ve gotten the complete story of how each request was handled. In case you prefer the original text to this visual styling, I’ve also archived the raw trace files.

This sort of regular, simple instrumentation may seem naive, but the regularity and simplicity offer some benefits. For example, all of the instrumentation points have obvious names: they are the same as the states of the FSM. This alone continues to help beginners bootstrap their understanding of Webmachine. When they’re confused about why something happened, they can go straight to the trace or debugger, and either search for the name of the decision they expected to turn differently, or find the name of the decision that did go differently, and know exactly where to return to in their code. Resource implementors add no code, but get well-labeled tracing for free.

Finite state machines can be found under many other names: flowcharts, chains, pipelines, decision trees, and more. Any staged-processing workflow benefits from a basic “stage X began work W”, “stage X finished work W”, which is completely independent of what the stage is doing, and is equivalent to the stage entering and exiting the “working” state. See Hadoop’s job statistics for an example: generically generated start/stop information that an operator can use to get a basic idea of progress without needing the job implementor to add their own instrumentation. I sometimes even consider the basic request/response logging of multi-service systems as a form of this: sending a request is equivalent to entering a waiting state, etc.
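
The “stage X began” / “stage X finished” pattern can be sketched the same way for a pipeline. This Python example is illustrative only; the runner, the stage names, and the record format are all hypothetical.

```python
import time


def run_pipeline(stages, work, record):
    """Run each (name, function) stage in order, recording start and finish.

    The stages themselves contain no instrumentation; the runner reports
    when each one began and how long it took.
    """
    for name, stage in stages:
        record(("began", name, None))
        started = time.monotonic()
        work = stage(work)
        record(("finished", name, time.monotonic() - started))
    return work


events = []
result = run_pipeline(
    [("parse", lambda text: text.split(",")),
     ("count", len)],
    "a,b,c",
    events.append,
)
print(result)
for kind, name, seconds in events:
    print(kind, name, seconds)
```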

To speak more broadly, the important points to instrument are those when application state is changing. This is how I track down where a process diverged from its expected path, or how long it took to make the change. Finite state machines help by making those points more obvious. Instrumenting state transitions reduces the burden on the implementor, by naturally answering the question of where instrumentation belongs and what it’s called. It also reduces the burden on the user of learning what the implementor decided. Inspection of the system becomes easier because the state transitions are always instrumented, and instrumented in a way that maps directly to the system’s operation.

Thanks to Julia and Charity for organizing the instrumentation series.

Beer IoT (Part 2)

Welcome back for part two. In part one, I explained how I exported my historical brewing data from The BeerBug’s website. In this part, I’m going to demonstrate what I’ve learned about one alternative, the Helium platform.

Helium doesn’t sell a homebrew device, but rather a generic sensor platform. I ordered a dev kit while they were on sale, and while I’m waiting for my hardware to arrive, I have gained access to their data aggregation platform.

Disclaimer: I know several of the Helium developers, but I am not being compensated in any way to review their system.

Helium supports creating “virtual sensors” and uploading whatever data you like for them, as a way to test and experiment. What better data to play with than something I’m already familiar with? I’ll upload the BeerBug data I exported.

When a Helium sensor posts a reading, it specifies a “port” for that reading. The port is primarily a label of what the reading is, but the examples given and port names reserved suggest that they’re intended to label the “type” of the reading. For example, port “t” is reserved for temperature in Celsius, and port “b” is battery level in millivolts. I have data for each of those, as well as a port I’m going to call “sg” for specific gravity.

Logging a reading is done by HTTP-POSTing some JSON data. The basic form looks like this:

{
 "data": {
   "attributes": {
     "port": "sg", // the name of the port
     "value": 1.0568, // the value for the reading
     "timestamp": "2016-01-23T18:35:03Z" // ISO8601 time in UTC
   },
   "type": "data-point"
 }
}

My data is all floating point numbers, so nothing too complex to worry about … except it’s all in the wrong format. To start with, my data looks like this:

{
 "dates": [ // comma-separated, zero-based month index, in local time
   "2016,0,23,18,35,3",
   // ... the rest of the dates ...
 ],
 "temp": [ // fahrenheit degrees
   70.26
   // ... the rest of the temperatures ...
 ],
 "sg": [ // specific gravity
   1.0568
   // ... the rest of the specific gravities ...
 ]
}

After many iterations, this is my jq script for conversion:

[.dates, .sg, .temp, .batt] | transpose | .[] |

  # there is probably a better way to convert from 0-based month to ISO8601
  # strptime bails on 0-based month, but produces a 0-based month structure?
  (.[0] | split(",") |
   [.[0],(.[1] | tonumber | .+1 | tostring),.[2],.[3],.[4],.[5]] |
   join(",") | strptime("%Y,%m,%d,%k,%M,%S") | todate) as $date |

  # specific gravity
  {"data":{"attributes":{"port":"sg","value":.[1],"timestamp":$date},
           "type":"data-point"}},

  # temperature - assumed fahrenheit (helium is celsius)
  {"data":{"attributes":{"port":"t","value":((.[2] - 32) * 5 / 9),"timestamp":$date},
           "type":"data-point"}},

  # battery level - assumed volts (helium is millivolts)
  {"data":{"attributes":{"port":"b","value":(.[3] * 1000),"timestamp":$date},
           "type":"data-point"}}

It has one major bug still: I’m just using local time as UTC. Just figuring out how to deal with the zero-based month was enough hassle (strptime produces an array that uses a zero-based month, but it can’t consume a string with one). It seems like the addition of a mktime | . + 28800 | gmtime (or 25200) would be close enough … but I should have exported in UTC to start with.
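
For comparison, here is how the same date handling might look in Python, with the local-to-UTC step included. It is a sketch rather than a replacement for the jq script: the offset is passed in as a parameter (the 28800 and 25200 seconds above correspond to eight and seven hours), and deciding which offset applies to each reading is left out.

```python
from datetime import datetime, timedelta, timezone


def beerbug_date_to_utc(date_string, utc_offset_hours):
    """Convert "year,month0,day,hour,minute,second" local time to ISO8601 UTC.

    month0 is the zero-based month index used in the export.
    utc_offset_hours is the local zone's offset from UTC, e.g. -8.
    """
    year, month0, day, hour, minute, second = (
        int(part) for part in date_string.split(",")
    )
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local = datetime(year, month0 + 1, day, hour, minute, second,
                     tzinfo=local_zone)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(beerbug_date_to_utc("2016,0,23,18,35,3", -8))
```

With an eight-hour offset, the first reading lands on the following day in UTC, which is exactly the kind of shift the jq version currently ignores.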

But anyway, let’s run this through jq:

$ jq -cf beerbug-to-helium.jq export-oatmeal-stout-jan-2016.json > helium-oatmeal-stout-jan-2016.json
$ head -3 helium-oatmeal-stout-jan-2016.json
{"data":{"attributes":{"port":"sg","value":1.0568,"timestamp":"2016-01-23T18:35:03Z"},"type":"data-point"}}
{"data":{"attributes":{"port":"t","value":21.255555555555556,"timestamp":"2016-01-23T18:35:03Z"},"type":"data-point"}}
{"data":{"attributes":{"port":"b","value":4146.7,"timestamp":"2016-01-23T18:35:03Z"},"type":"data-point"}}
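
The unit conversions in those three lines can be checked with a short Python sketch. The function name is hypothetical, and the 4.1467 V battery value is simply chosen to match the output above; the timestamp is passed through untouched.

```python
import json


def to_data_points(timestamp, sg, temp_f, batt_volts):
    """Build one data-point per port from a single exported reading."""
    readings = [
        ("sg", sg),                       # specific gravity, unchanged
        ("t", (temp_f - 32) * 5 / 9),     # Fahrenheit to Celsius
        ("b", batt_volts * 1000),         # volts to millivolts
    ]
    return [
        {"data": {"attributes": {"port": port,
                                 "value": value,
                                 "timestamp": timestamp},
                  "type": "data-point"}}
        for port, value in readings
    ]


points = to_data_points("2016-01-23T18:35:03Z", 1.0568, 70.26, 4.1467)
for point in points:
    # Compact separators give one data-point per line, like jq -c.
    print(json.dumps(point, separators=(",", ":")))
```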

Now I have one data-point per line, which will make uploading easy. But before uploading, I need to actually create my virtual sensor. This can be done via Helium’s HTTP API, but their example is missing the POST body (though I assume it’s the same as the update’s body, without the “id” field), and it’s just so simple with the Helium Commander utility installed (yes, I’ve censored the UUID):

$ helium sensor create --name beerbug-536
$ helium --uuid sensor list
+--------------------------------------+-----+------+-----------------------------+----------------------------+-------------+
| ID                                   | MAC | TYPE | CREATED                     | SEEN                       | NAME        |
+--------------------------------------+-----+------+-----------------------------+----------------------------+-------------+
| ABIGUUID-USED-TOBE-HERE-BUTISGONENOW |     |      | 2016-12-18T06:11:54.182691Z | 2016-12-19T04:49:57.00331Z | beerbug-536 |
+--------------------------------------+-----+------+-----------------------------+----------------------------+-------------+
$ export HELIUM_BEERBUG=ABIGUUID-USED-TOBE-HERE-BUTISGONENOW

Now I can finally upload some data! I’m just going to pipe the file I have through xargs and let things chug along. The sed work at the front is needed to escape the double-quotation marks in the json file, so that xargs doesn’t remove them:

$ sed 's/"/\\"/g' helium-oatmeal-stout-jan-2016.json |\
  xargs -n 1 curl -H "Content-Type: application/json" \
  -H "Authorization: $HELIUM_API_KEY" -XPOST \
  "https://api.helium.com/v1/sensor/$HELIUM_BEERBUG/timeseries" -d

That … was slow. About 12,000 data-points in an hour. Or, three per second, as some insist all speeds be measured. I have around 65,000 data points, so that would be five hours or more. That’s my fault, though – starting curl all the way over again for each data point is way expensive. Let’s split up the work and run three curls in parallel:

$ tail -n +12001 helium-oatmeal-stout-jan-2016.json |\
  grep "\"b\"" > helium-oatmeal-stout-jan-2016.json-b
$ tail -n +12001 helium-oatmeal-stout-jan-2016.json |\
  grep "\"sg\"" > helium-oatmeal-stout-jan-2016.json-sg
$ tail -n +12001 helium-oatmeal-stout-jan-2016.json |\
  grep "\"t\"" > helium-oatmeal-stout-jan-2016.json-t
$ sed 's/"/\\"/g' helium-oatmeal-stout-jan-2016.json-b |\
  xargs -n 1 curl -H "Content-Type: application/json" \
  -H "Authorization: $HELIUM_API_KEY" -XPOST \
  "https://api.helium.com/v1/sensor/$HELIUM_BEERBUG/timeseries" -d &
$ sed 's/"/\\"/g' helium-oatmeal-stout-jan-2016.json-sg |\
  xargs -n 1 curl -H "Content-Type: application/json" \
  -H "Authorization: $HELIUM_API_KEY" -XPOST \
  "https://api.helium.com/v1/sensor/$HELIUM_BEERBUG/timeseries" -d &
$ sed 's/"/\\"/g' helium-oatmeal-stout-jan-2016.json-t |\
  xargs -n 1 curl -H "Content-Type: application/json" \
  -H "Authorization: $HELIUM_API_KEY" -XPOST \
  "https://api.helium.com/v1/sensor/$HELIUM_BEERBUG/timeseries" -d

That was better, at about 8-ish points per second. I don’t expect much better out of my non-business DSL line. It’s saturated enough that MARIO RUN is delaying the starts of the games that I’m playing while waiting. If I were planning to bulk-load other data, I’d write something that kept the HTTP connection open and pipelined POSTs.
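
A rough Python sketch of the “keep the connection open” idea follows. It reuses one connection for sequential POSTs, which is simpler than true pipelining, and it is untested against Helium. The function name is hypothetical, and the connection object is passed in (for example an http.client.HTTPSConnection) so the loop itself can be tried without a network.

```python
import json


def post_points(conn, path, api_key, points):
    """POST each data-point over one already-open connection.

    conn needs request() and getresponse() methods like those of
    http.client.HTTPSConnection. Returns the HTTP status of each POST.
    """
    headers = {
        "Content-Type": "application/json",
        "Authorization": api_key,
    }
    statuses = []
    for point in points:
        conn.request("POST", path, body=json.dumps(point), headers=headers)
        response = conn.getresponse()
        response.read()  # drain the body so the connection can be reused
        statuses.append(response.status)
    return statuses


# Possible usage (not run here; needs network access and a real key):
#   import http.client, os
#   conn = http.client.HTTPSConnection("api.helium.com")
#   path = "/v1/sensor/" + os.environ["HELIUM_BEERBUG"] + "/timeseries"
#   with open("helium-oatmeal-stout-jan-2016.json") as lines:
#       post_points(conn, path, os.environ["HELIUM_API_KEY"],
#                   (json.loads(line) for line in lines))
```

This avoids starting a new process and a new TLS handshake for every data point, which is where most of the time in the xargs version goes.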

The real question I’ve been waiting on is, now that the data is in Helium’s system, what can I do with it? The bummer news is that I can’t use their web dashboard. It only goes back 90 days, and this data is from nearly a year ago. Maybe I’ll adjust the dates in another experiment. I think the only way to change data later might be to make a new sensor (i.e. you don’t get to change it – you have to rewrite it), so maybe best to think about where you scribble.

But, I can do basic retrieval, with filter[start]= and filter[end]=:

$ curl -H "Authorization: $HELIUM_API_KEY" -XGET \
  "https://api.helium.com/v1/sensor/$HELIUM_BEERBUG/timeseries?filter%5Bstart%5D=2016-02-01T12:00:00Z&filter%5Bend%5D=2016-02-01T12:05:00Z" |\
  jq .
{
 "data": [
   {
    "attributes": {
      "value": 4162.5,
      "timestamp": "2016-02-01T12:04:01Z",
      "port": "b"
    },
    "relationships": {
      "sensor": {
        "data": {
          "id": "8dce390e-082a-47fc-85cf-43adafd30edd",
          "type": "sensor"
        }
      }
    },
    "id": "89b47b2f-500d-4af3-9d01-49766b5938b0",
    "meta": {
      "created": "2016-12-23T06:05:50.757111Z"
    },
    "type": "data-point"
   },
   {
    "attributes": {
      "value": 1.0131,
      "timestamp": "2016-02-01T12:04:01Z",
      "port": "sg"
    },
    "relationships": {
      "sensor": {
        "data": {
          "id": "8dce390e-082a-47fc-85cf-43adafd30edd",
          "type": "sensor"
        }
      }
    },
    "id": "645ca2f8-96aa-4cd9-915d-3670ec1b43af",
    "meta": {
      "created": "2016-12-23T06:06:21.478522Z"
    },
    "type": "data-point"
   },
   {
    "attributes": {
      "value": 18.672222222222224,
      "timestamp": "2016-02-01T12:04:01Z",
      "port": "t"
    },
    "relationships": {
      "sensor": {
        "data": {
        "id": "8dce390e-082a-47fc-85cf-43adafd30edd",
        "type": "sensor"
      }
    }
   },
   "id": "44afd122-b13d-4675-b35a-e48184f32c9a",
   "meta": {
     "created": "2016-12-23T06:06:38.950493Z"
   },
   "type": "data-point"
  },
...

I’ve elided the data points at 12:03:01, 12:02:01, and 12:01:01 for brevity. This is a bit verbose, and seems to contain a lot of duplicate information. It all makes more sense when you learn that you can query the same data by organization, element, or label, which each map to groups of sensors.
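
The %5B and %5D in that URL are just the percent-encoded square brackets of filter[start] and filter[end]. A small Python sketch shows the encoding; the helper name is hypothetical.

```python
from urllib.parse import urlencode


def timeseries_query(start, end):
    """Build the query string for a start/end filtered request."""
    # safe=":" leaves the colons in the timestamps unescaped, matching
    # the URL in the curl command; the brackets are still encoded.
    return urlencode({"filter[start]": start, "filter[end]": end}, safe=":")


print(timeseries_query("2016-02-01T12:00:00Z", "2016-02-01T12:05:00Z"))
```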

It’s also possible to request basic aggregate statistics for this data, by adding agg[type]= and agg[size]=. The types currently available are min, max, and avg, and window sizes start at one minute and go up to one day.

$ curl -H "Authorization: $HELIUM_API_KEY" -XGET \
  "https://api.helium.com/v1/sensor/$HELIUM_BEERBUG/timeseries?filter%5Bstart%5D=2016-02-01T12:00:00Z&filter%5Bend%5D=2016-02-01T12:30:00Z&agg%5Btype%5D=avg&agg%5Bsize%5D=10m" |\
  jq .
{
 "data": [
   {
    "attributes": {
      "value": {
        "max": 18.7,
        "avg": 18.6819444444444,
        "min": 18.6555555555556
      },
      "timestamp": "2016-02-01T12:20:00Z",
      "port": "agg(t)"
    },
    "relationships": {
      "sensor": {
        "data": {
          "id": "8dce390e-082a-47fc-85cf-43adafd30edd",
          "type": "sensor"
        }
      }
    },
    "id": "ff308e69-a2c5-43a8-9215-dd4042b51104",
    "meta": {
      "created": "2016-12-23T06:06:46.98618Z"
    },
    "type": "data-point"
   },
   {
    "attributes": {
      "value": {
        "max": 1.0133,
        "avg": 1.01325,
        "min": 1.0132
      },
      "timestamp": "2016-02-01T12:20:00Z",
      "port": "agg(sg)"
    },
    "relationships": {
      "sensor": {
        "data": {
          "id": "8dce390e-082a-47fc-85cf-43adafd30edd",
          "type": "sensor"
        }
      }
    },
    "id": "9d09823b-5302-4fd8-94f4-9c1e2ef62b99",
    "meta": {
      "created": "2016-12-23T06:06:29.719129Z"
    },
    "type": "data-point"
   },
   {
    "attributes": {
      "value": {
        "max": 4168,
        "avg": 4161.15,
        "min": 4152.5
      },
      "timestamp": "2016-02-01T12:20:00Z",
      "port": "agg(b)"
    },
    "relationships": {
      "sensor": {
        "data": {
          "id": "8dce390e-082a-47fc-85cf-43adafd30edd",
          "type": "sensor"
        }
      }
    },
    "id": "5cd24bb5-30ea-4278-bbb0-082c8f25a5fe",
    "meta": {
      "created": "2016-12-23T06:06:01.779172Z"
    },
    "type": "data-point"
   },
...

Again, I’ve elided the results for 12:10 and 12:00 for brevity. This seems like it could be very convenient for supporting something like a dashboard. Some things I haven’t shown are the ability to choose a limited number of ports, and how large result sets are paginated, but those are also quite simple. It seems like the requests to support basic display of min/max/avg data on a zoomable/scrollable timeline would be very straightforward. And, that’s what Helium’s dashboard appears to give you, if your data is recent.
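
To make the windowed statistics concrete, here is a Python sketch of the same kind of min/max/avg bucketing done locally. How Helium actually aligns its windows is an assumption on my part (this version aligns them to multiples of the window size since the Unix epoch), and the sample values are made up.

```python
from collections import defaultdict
from datetime import datetime, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def aggregate(points, window_seconds):
    """Group (iso_timestamp, value) pairs into fixed windows.

    Returns {window_start_iso: {"min": ..., "max": ..., "avg": ...}}.
    """
    buckets = defaultdict(list)
    for stamp, value in points:
        when = datetime.strptime(stamp, ISO_FORMAT).replace(tzinfo=timezone.utc)
        epoch = int(when.timestamp())
        # Round down to the start of the window containing this reading.
        buckets[epoch - epoch % window_seconds].append(value)

    result = {}
    for start in sorted(buckets):
        values = buckets[start]
        label = datetime.fromtimestamp(start, tz=timezone.utc).strftime(ISO_FORMAT)
        result[label] = {
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
        }
    return result


# Made-up readings, ten-minute windows.
sample = [
    ("2016-02-01T12:01:01Z", 1.0),
    ("2016-02-01T12:04:01Z", 3.0),
    ("2016-02-01T12:11:00Z", 5.0),
]
print(aggregate(sample, 600))
```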

But I need some way to visualize historical data as well. Read part three to find out what I came up with.

Beer IoT (Part 1)

I’m not super into the Internet-of-Things. There are no wifi lightbulbs, electronic locks, or smart thermostats in my house. But, I’m a homebrewer, and that means I love new ways to get data about my beer. I backed The BeerBug on Kickstarter, and I’ve used it on a number of batches since early 2014.

The data my BeerBug provides is simple, but interesting: air temperature and specific gravity, measured once per minute. It gives me a pretty good idea of when a beer has finished or stalled.

The user experience leaves something to be desired, though. The website is clunky, and was down for a month or more recently. The mobile app is just a web view. There is no way to use the device without the website.

So, I have two goals over the next few months. The first is to extract all of the data I have recorded with my BeerBug, and the second is to find an alternative. This post covers the first goal, and the next will begin to explore the second.

The BeerBug offers an API … that only covers active brewing, not history. Beer pages allegedly offer CSV and XML data download, but the links haven’t worked in months. You can view graphs of historical brews on the website, though, so they have the ability to fetch that data.

Pulling up the Chrome web inspector and visiting a beer page, there is an XHR for a “graph.php” that returns JSON to draw the graph. Try as I might, I haven’t been able to construct a curl command to get the same data – it always came through with “0” or “null” in several fields. There’s almost certainly some header I’m missing, but I’ve taken an alternate route.

The network tab of Chrome’s web inspector will let you “Save as HAR with Content.” This exports a JSON file with all the information the inspector is showing. Lucky for me, this includes the content of the graph.php XHR response. So, switching the graph view from “25 points” to “all” and waiting for the new graph.php request to complete, then saving as HAR has captured my data.

The data from the XHR is the last in the log entries, so it’s easy to extract with jq:

$ jq ".log.entries[-1].response.content.text | fromjson" \
  export-oatmeal-stout-jan-2016.har > export-oatmeal-stout-jan-2016.json
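
If jq isn't handy, the same extraction is a few lines of Python. This sketch follows the same path through the HAR structure as the jq expression; the function name is hypothetical, and the tiny HAR at the bottom is a stand-in rather than real export data.

```python
import json


def graph_data_from_har(har):
    """Return the parsed JSON body of the last entry in a HAR log."""
    text = har["log"]["entries"][-1]["response"]["content"]["text"]
    # The response body is stored as a string, so it needs a second parse
    # (this is what "fromjson" does in the jq version).
    return json.loads(text)


# Stand-in HAR with two entries; only the last one holds the graph data.
fake_har = {
    "log": {
        "entries": [
            {"response": {"content": {"text": "{}"}}},
            {"response": {"content": {"text": "{\"sg\": [1.0568]}"}}},
        ]
    }
}
print(graph_data_from_har(fake_har))

# With a real file (not run here):
#   with open("export-oatmeal-stout-jan-2016.har") as f:
#       data = graph_data_from_har(json.load(f))
```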

Now I can start to explore the data:

$ jq ". | keys" export-oatmeal-stout-jan-2016.json
[
 "al",
 "batt",
 "dates",
 "degrees",
 "ext",
 "plato",
 "platod",
 "sg",
 "success",
 "temp",
 "temp2"
]

Almost all of these fields are arrays with one entry per measurement:

  • al: alcohol percentage
  • batt: battery voltage (volts)
  • dates: date of measurement (comma-separated strings year,month,day,hour,minute,second – not width-padded, zero-based month index, local timezone)
  • platod: degrees plato
  • sg: specific gravity
  • temp: air temperature (either Fahrenheit or Celsius, depending on value of “degrees” field)
  • temp2: probe temperature

Non-array fields:

  • degrees: what units “temp” and “temp2” are in (“F” for Fahrenheit, and I assume “C” for Celsius, but I haven’t checked)
  • ext: unknown
  • plato: unknown
  • success: unknown

Just a bit of data checking: I started the beer on January 23, 2016, and finished it on February 8:

$ jq ".dates[0], .dates[-1]" export-oatmeal-stout-jan-2016.json
"2016,0,23,18,35,3"
"2016,1,08,15,18,3"

Its specific gravity started about where I normally start my beers, and ended a little below where I normally finish them:

$ jq ".sg[0], .sg[-1]" export-oatmeal-stout-jan-2016.json
1.0568
1.0082

That means it may have a 6.4% alcohol content by volume:

$ jq ".al[0], .al[-1]" export-oatmeal-stout-jan-2016.json
0
6.4
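
As a sanity check on that number, a common homebrewing rule of thumb estimates alcohol by volume as the gravity drop times 131.25. I don't know which formula the BeerBug actually uses, so this Python sketch is only a cross-check.

```python
def estimate_abv(original_gravity, final_gravity):
    """Approximate ABV (%) from the drop in specific gravity.

    Uses the common rule-of-thumb factor of 131.25; other formulas exist.
    """
    return (original_gravity - final_gravity) * 131.25


print(round(estimate_abv(1.0568, 1.0082), 1))
```

With the starting and ending gravities above, the estimate rounds to 6.4, which agrees with the reported value.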

And finally, it was kept in a nice cool range (`add / length` is jq for “average”):

$ jq ".temp | max, min, add / length" export-oatmeal-stout-jan-2016.json
71.18
63.4
65.68423989795319

Neat. Let’s compare all the beers I exported:

# extract all xhr data
$ for x in export*.har; \
    do jq ".log.entries[-1].response.content.text | fromjson" $x \
    > ${x/har/json}; \
  done
# extract basic data
$ for x in export*.json; \
    do echo $x && jq -c '{"sg":.sg[0],"fg":.sg[-1],"abv":.al[-1],"temp":{"min":.temp|min,"max":.temp|max,"avg":(.temp|add/length)}}' $x; \
  done
export-abbey-oct-2015.json
{"sg":1.0498,"fg":1.4284,"abv":0,"temp":{"min":69.74,"max":79.96,"avg":72.70824454043661}}
export-beechwood-smoke-may-2014.json
{"sg":1.0511,"fg":0.9935,"abv":7.5,"temp":{"min":71.8,"max":83,"avg":75.40845794392524}}
export-butternut-stout-nov-2014.json
{"sg":1.0529,"fg":1.3635,"abv":0,"temp":{"min":65.36,"max":74.41,"avg":69.15657534246593}}
export-ipa-may-2015.json
{"sg":1.0475,"fg":0.9946,"abv":6.7,"temp":{"min":68.81,"max":80.21,"avg":71.19772108108131}}
export-mead.json
{"sg":1.115,"fg":1.0389,"abv":10,"temp":{"min":61,"max":70.84,"avg":65.09618010573946}}
export-oatmeal-stout-jan-2016.json
{"sg":1.0568,"fg":1.0082,"abv":6.4,"temp":{"min":63.4,"max":71.18,"avg":65.68423989795319}}
export-oatmeal-stout-nov-2015.json
{"sg":1.0639,"fg":1.0108,"abv":7,"temp":{"min":63.66,"max":77.25,"avg":69.64541020966313}}
export-oatmeal-stout-sep-2014.json
{"sg":1.0499,"fg":0.9973,"abv":7.3,"temp":{"min":72.3,"max":81.8,"avg":76.59252173913043}}
export-pumpkin-ale-nov-2015.json
{"sg":1.0529,"fg":1.0134,"abv":5.2,"temp":{"min":63.37,"max":70.69,"avg":66.15414939483689}}

There is quite a bit more analysis that should be done on this data. For example, I know that the specific gravity jumps around quite a lot. It is measured by a hall-effect sensor capturing the weight of a plumb in the beer, and so it’s a bit touchy about temperature changes and carbonation bubbles from active yeast. Those simple stats about the temperature (min, max, mean) do not really tell the whole story.
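One possible first step for that analysis is a rolling median, which discards isolated spikes without being dragged around by them the way a mean would be. This is only a sketch of the idea: rolling_median and its window size are choices made for this example, and the sample readings are invented to show the effect, not taken from my exports.

```python
from statistics import median

def rolling_median(values, window=5):
    """Return a list the same length as values, each entry the median of a centered window."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        # Windows are clipped at the ends, so edge entries use fewer neighbors.
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed

# Invented readings with one obvious spike in the middle.
readings = [1.050, 1.049, 1.428, 1.048, 1.047]
print(rolling_median(readings, window=3))
```

With a window of three, the spike is replaced by the median of itself and its two neighbors, so it no longer dominates a first/last or min/max summary.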

But, I’m fairly well convinced that I now have a copy of my recorded data. What is the path forward? Find out in part two.

Roundtripping the HTTP Flowchart

Webmachine hackers are familiar with a certain flowchart, Alan Dean’s HTTP flowchart, which represents the decisions made during the processing of an HTTP request. Webmachine was designed as a practical executable form of that flowchart.

It has long bugged many of the Webmachine hackers that this relationship is one-way, though. Webmachine was made from the graph, but the graph wasn’t made from Webmachine. I decided to change that in my evenings last week, while trying to take my mind off of Riak 1.0 testing.

This is a version of the HTTP flowchart that only a Webmachine hacker could love. It’s ugly and missing some information, but the important part is that it’s generated by parsing webmachine_decision_core.erl.

I’ve shared the code for generating this image in the gen-graph branch of my webmachine fork. Make sure you have Graphviz installed, then check out that branch and run make graph && open docs/wdc_graph.png.

In addition to the PNG, you’ll also find a docs/wdc_graph.dot if you prefer to render to some other format.

If you’d really like to dig in, I suggest firing up an Erlang node and looking at the output of wdc_graph:parse("src/webmachine_decision_core.erl"):

[{v3b13, [ping],                     [v3b13b,503]},
 {v3b13b,[service_available],        [v3b12,503]},
 {v3b12, [known_methods],            [v3b11,501]},
 {v3b11, [uri_too_long],             [414,v3b10]},
 {v3b10, [allowed_methods,'RESPOND'],[v3b9,405]},
 {v3b9,  [malformed_request],        [400,v3b8]},
...

If you’ve looked through webmachine_decision_core at all, I think you’ll recognize what’s presented above: a list of tuples, each one representing the decision named by the first element, with the calls made to a resource module as the second element, and the possible outcomes as the third element. Call wdc_graph:dot/2 to convert those tuples to a DOT file.
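To make the shape of that conversion concrete, here is a rough sketch in Python rather than Erlang. It is not the code behind wdc_graph:dot/2, and the real output may be laid out differently; to_dot and the graph name are invented for this illustration. Each tuple becomes one labeled node plus one edge per possible outcome.

```python
def to_dot(decisions, graph_name="wdc"):
    """Render (decision, resource_calls, outcomes) tuples as Graphviz DOT text."""
    lines = ["digraph %s {" % graph_name]
    for name, calls, outcomes in decisions:
        # Label each node with the decision name and the resource calls it makes.
        label = "%s\\n%s" % (name, ", ".join(calls))
        lines.append('  "%s" [label="%s"];' % (name, label))
        for target in outcomes:
            # Outcomes are either another decision or an HTTP status code.
            lines.append('  "%s" -> "%s";' % (name, target))
    lines.append("}")
    return "\n".join(lines)

decisions = [
    ("v3b13", ["ping"], ["v3b13b", 503]),
    ("v3b13b", ["service_available"], ["v3b12", 503]),
]
print(to_dot(decisions))
```

Both decisions point at the same "503" node here, which is the same sharing that produces the long crossing arrows in the full graph.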

There are a few holes in the generation. Some response codes are reached by decisions spread across the graph, causing long arrows to cross confusingly. The edges between decisions aren’t labeled with the criteria for following them. Some resource calls are left out (like those made from webmachine_decision_core:respond/1 and the response body producers and encoders). It’s good to have a nice list for future tinkering.

NerdKit Gaming: Part 2

If you were interested in my last bit of alternative code-geekery, you may also be interested to hear that I’ve pushed that NerdKit Gaming code farther. If you browse the github repository now, you’ll find that the game also includes a highscore board, saved in EEPROM so it persists across reboot. It also features a power-saving mode that kicks in if you don’t touch any buttons for about a minute. Key-repeat now also allows the player to hold a button down, instead of pressing it repeatedly, in order to move the cursor multiple spaces.

You may remember that I left off my last blog post noting that there wasn’t much left for the game until I could find a way to slim down the code to fit new things. So what allowed these new features to fit?

Well, I did find ways to slim down the code: I was right about making the game state global. But I also re-learned a lesson that is at the core of hacking: check your base assumptions before fiddling with unknowns. In this case, my base assumption was the Makefile I imported from an earlier NerdKits project. While making the game state global saved a little more than 1k of space, changing the Makefile so that unused debugging utilities (uart, printf, scanf) weren’t linked in saved about 6k.

Along the way, I also found that attempting to out-guess gcc’s “space” optimization is a losing game. Making the game state global had a positive effect on space, but making the button state global had a negative effect. Changing integer types would help in one place, but hurt in others. I’m not intimately familiar with the rules of that optimizer, so it felt like spinning a wheel of chance to choose which thing to prod next.

You may notice that I ultimately returned the game state to a local variable, passed in and out of each function that needed it. The reason for this was testability. It’s simply easier to test something that doesn’t depend on global state. Once I had a bug that required running a few specific game states through these functions repeatedly, it just made sense to pay the price in program space in order to be able to write unit tests to cover some behaviors.

So now what’s next? This time, it’s not much until I buy a new battery. So much reloading and testing finally drained the original 9V. Once power is restored, I’ll probably dig into some new peripheral … maybe something USB?