ASAP Smoothing

Last week, Peter Bailis announced ASAP, a new tool for smoothing time-series data for plotting.

Since I just happened to have some fresh data handy, I decided to try it out. You may recall this graph of the difference in pressure measured between a sensor submerged in fermenting beer and a sensor in open air:

Helium-smoothed pressure data (190 points)

That graph was based on data that was pre-smoothed using a windowed average provided by the Helium API. The window is one hour, which produced 190 points. Zooming in just a little bit takes us to a 30-minute window, at 303 points:

Window-averaged pressure data (303 points)
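
For reference, a windowed average of this kind can be sketched in a few lines. This is only an illustration of the general technique, not the Helium API’s implementation: the function name `windowed_average`, the (timestamp, value) tuple format, and the sample data are all assumptions made up for the example.

```python
from collections import defaultdict


def windowed_average(samples, window_seconds):
    """Average (timestamp, value) samples into fixed-width time windows."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Integer division assigns each sample to the window it falls in.
        buckets[int(timestamp // window_seconds)].append(value)
    # One output point per non-empty window, stamped at the window start.
    return [
        (index * window_seconds, sum(values) / len(values))
        for index, values in sorted(buckets.items())
    ]


# Hypothetical data: one reading per minute for three hours.
samples = [(minute * 60, float(minute % 10)) for minute in range(180)]
hourly = windowed_average(samples, 3600)
half_hourly = windowed_average(samples, 1800)
print(len(hourly), len(half_hourly))
```

In this sketch a window that contains no samples simply produces no point, so the output count depends on where the data actually falls, not just on the window width.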

Unlike the other graphs I plotted in that series, I didn’t plot the maximum and minimum on this one. Here is what the raw data looks like:

Raw pressure data (11282 points)

That is 11282 points. There are spikes several times taller than what looks like the “typical” variation. The question, then, is whether the windowed average portrayed this data accurately. Here are the 303 points overlaid on the 11282 (I cut off the peaks to give us a little more detail):

Raw pressure data (grey) with Window-averaged data (blue)

I think it seems reasonable. A little noisy, but visually following the mid-point of the band. Can ASAP do better? Here are the 304 points it gives when asked for 303 from this dataset:

Raw data (grey), window-averaged (blue), ASAP (red)

If you look closely, you can see a few blue edges from the windowed average peeking out from under the ASAP plot, but they’re largely the same. ASAP hasn’t surfaced any additional features in this data that the plain windowed average hid. I think that’s not surprising. The process being measured was a slow, continuous change, not something where there should have been sudden changes in behavior.
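
To give a feel for why the two curves end up so similar here, this is a rough sketch of the idea behind ASAP as I understand it: pick the moving-average window that makes the line as smooth as possible (lowest “roughness”, the standard deviation of the point-to-point differences) without flattening outliers (kurtosis must not drop below that of the original). This brute-force version is an assumption-laden simplification, not the library’s code. The names `moving_average`, `roughness`, `kurtosis`, and `pick_window` are invented for the example, and the data is synthetic.

```python
import random
import statistics


def moving_average(values, window):
    """Simple moving average; returns len(values) - window + 1 points."""
    total = sum(values[:window])
    out = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        out.append(total / window)
    return out


def roughness(values):
    """Standard deviation of the first differences."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return statistics.pstdev(diffs)


def kurtosis(values):
    """Fourth standardized moment (non-excess kurtosis)."""
    mean = statistics.fmean(values)
    variance = statistics.fmean([(v - mean) ** 2 for v in values])
    if variance == 0:
        return 0.0
    fourth = statistics.fmean([(v - mean) ** 4 for v in values])
    return fourth / variance ** 2


def pick_window(values, max_window):
    """Brute force: smoothest window whose kurtosis is not below the original's."""
    original_kurtosis = kurtosis(values)
    best_window, best_roughness = 1, roughness(values)
    for window in range(2, max_window + 1):
        smoothed = moving_average(values, window)
        if len(smoothed) < 3:
            break
        candidate = roughness(smoothed)
        if kurtosis(smoothed) >= original_kurtosis and candidate < best_roughness:
            best_window, best_roughness = window, candidate
    return best_window


# Synthetic data (an assumption): a slow drift plus uniform noise.
rng = random.Random(42)
values = [0.01 * i + rng.uniform(-1.0, 1.0) for i in range(500)]
window = pick_window(values, 50)
print("chosen window:", window)
```

For a slow, continuous process, any reasonably wide averaging window passes both tests, which fits with ASAP and the plain windowed average agreeing on the pressure data.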

So instead, let’s look at some data with an anomaly, like the absolute-force graph from the tilt sensor:

Helium-smoothed force data (dark blue) with min-max area (light blue)

That was using window-averaged data as well. Let’s plot it against raw data and ASAP like before:

Raw force data (grey), window-averaged (blue), ASAP (red)

That’s strange. The blue window-averaged line follows the raw data pretty well at the 303 points I asked for. The ASAP line looks weirdly off, though. The smooth function returned only 279 points, and the odd shape is caused by a gap in the middle. That straight line around the anomaly contains 30 points. The curve to the left of that line looks like it’s shifted in time, which makes me suspicious, but no errors seem to be generated. Even if I bump the resolution up to the full 890 pixels of this SVG, the ASAP curve still looks like this, and produces 31 fewer points than the windowed average.
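
One quick sanity check for a situation like this is to find the gap in the raw timestamps and compare it with the usual sample spacing. The sketch below uses hypothetical timestamps rather than the data from the gist, and the helper names `largest_gap` and `median_spacing` are made up for the example.

```python
import statistics


def largest_gap(timestamps):
    """Return (gap, start, end) for the widest spacing between consecutive timestamps."""
    ordered = sorted(timestamps)
    return max(
        (later - earlier, earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
    )


def median_spacing(timestamps):
    """Typical spacing between consecutive samples."""
    ordered = sorted(timestamps)
    return statistics.median(b - a for a, b in zip(ordered, ordered[1:]))


# Hypothetical timestamps: one sample per minute, with 30 minutes missing.
timestamps = [t * 60 for t in range(200) if not 100 <= t < 130]
gap, start, end = largest_gap(timestamps)
typical = median_spacing(timestamps)
print(f"largest gap: {gap}s between {start} and {end} (typical spacing: {typical}s)")
```

If the widest gap is many times the typical spacing, it is worth checking whether the smoothing code treats the samples as evenly spaced, since that could shift features in time on one side of the gap. That is only a guess at this point, not a diagnosis.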

The data and code I’m using to plot it are available in this gist: https://gist.github.com/beerriot/5e343e35e4930947fce77f36f1f5fbe5

Off to ping Peter…
