Last week, Peter Bailis announced ASAP, a new tool for smoothing time series data for plotting:
New preprint and demo for ASAP: automatic smoothing for time series visualization https://t.co/GnvhBAgf5B
— Peter Bailis (@pbailis) March 6, 2017
Since I just happened to have some fresh data handy, I decided to try it out. You may recall this graph of the difference in pressure measured between a sensor submerged in fermenting beer, and a sensor in open air:

That graph was based on data that was pre-smoothed using a windowed average provided by the Helium API. The window is one hour, which produced 190 points. Zooming in just a little bit takes us to a 30-minute window, at 303 points:

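For anyone who hasn't used one, a windowed average just groups readings into fixed-width time buckets and reports the mean of each bucket. The sketch below is my own minimal illustration of that idea, not the Helium API's implementation; the function name `windowed_average` and the `(timestamp, value)` input shape are assumptions made for the example.

```python
from collections import defaultdict


def windowed_average(points, window_seconds):
    """Average (timestamp, value) pairs into fixed-width time buckets.

    Hypothetical helper for illustration only. Returns a list of
    (bucket_start, mean) pairs sorted by time. Buckets that received no
    readings are simply absent from the output.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Round the timestamp down to the start of its window.
        bucket_start = (timestamp // window_seconds) * window_seconds
        buckets[bucket_start].append(value)
    return [(start, sum(vals) / len(vals))
            for start, vals in sorted(buckets.items())]


# One reading per minute for three hours, averaged into 30-minute windows.
readings = [(minute * 60, float(minute)) for minute in range(180)]
smoothed = windowed_average(readings, 30 * 60)
print(len(smoothed))  # six windows
```

Halving the window roughly doubles the number of output points, which is the trade-off between the one-hour and 30-minute versions of the graph.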
Unlike the other graphs I plotted in that series, I didn't plot the maximum and minimum on this one. Here is what the raw data looks like:

That is 11282 points. There are spikes several times taller than what looks like the "typical" variation. The question, then, is whether the windowed average portrayed this data accurately. Here are the 303 points overlaid on the 11282 (I cut off the peaks to give us a little more detail):

I think it seems reasonable. A little noisy, but visually following the mid-point of the band. Can ASAP do better? Here are the 304 points it gives when asked for 303 from this dataset:

If you look closely, you can see a few blue edges from the windowed average peeking out from under the ASAP plot, but they're largely the same. ASAP hasn't surfaced any additional features in this data that the plain windowed average hid. I think that's not surprising. The process being measured was a slow, continuous change, not something where there should have been sudden changes in behavior.
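As I understand the ASAP preprint, the core idea is to pick the moving-average window that makes the line as smooth as possible (lowest "roughness", the standard deviation of the first differences) without flattening outliers (the kurtosis of the smoothed series must not drop below that of the original). The sketch below is a brute-force illustration of that criterion only. It is not the ASAP library's code, all function names are my own, and the real tool searches for the window far more cleverly and also aggregates down to the requested number of points.

```python
import statistics


def moving_average(values, window):
    """Simple moving average; returns len(values) - window + 1 points."""
    total = sum(values[:window])
    out = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        out.append(total / window)
    return out


def roughness(values):
    """Standard deviation of the first differences of the series."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return statistics.pstdev(diffs) if diffs else 0.0


def kurtosis(values):
    """Fourth standardized moment; 0.0 for a constant series."""
    mean = sum(values) / len(values)
    m2 = sum((v - mean) ** 2 for v in values) / len(values)
    m4 = sum((v - mean) ** 4 for v in values) / len(values)
    return m4 / (m2 * m2) if m2 > 0 else 0.0


def pick_window(values, max_window):
    """Brute-force search: smoothest window that keeps kurtosis >= original."""
    best_window, best_rough = 1, roughness(values)
    original_kurtosis = kurtosis(values)
    # Keep at least two smoothed points so roughness is defined.
    limit = min(max_window, len(values) - 1)
    for window in range(2, limit + 1):
        smoothed = moving_average(values, window)
        if kurtosis(smoothed) < original_kurtosis:
            continue  # this window flattens the outliers too much
        candidate = roughness(smoothed)
        if candidate < best_rough:
            best_window, best_rough = window, candidate
    return best_window
```

If that reading is right, it fits what the graph shows: on a slow, steady process there are no outliers to protect, so the result is essentially a moving average with a well-chosen window.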
So instead, let's look at some data with an anomaly, like the absolute-force graph from the tilt sensor:

That was using window-averaged data as well. Let's plot it against raw data and ASAP like before:

That's strange. The blue window-averaged line follows the raw data pretty well, at the 303 points I asked for. The ASAP line looks weirdly off, though. The smooth function returned only 279 points, and the cause is a gap in the middle. That straight line around the anomaly contains 30 points. The fact that the curve to the left of that line looks like it's shifted in time makes me suspicious, but there seem to be no errors generated. Even if I bump up the resolution to the full 890 pixels of this SVG, the ASAP curve still looks like this, and produces 31 fewer points than the windowed average.
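One way to pin down where points go missing is to scan the x-values of a series for spacing that is much wider than usual. This is a generic diagnostic sketch, not part of ASAP or of the code in my gist; `find_gaps` and its `tolerance` parameter are hypothetical names, and it assumes each plotted point carries a numeric timestamp.

```python
import statistics


def find_gaps(timestamps, tolerance=1.5):
    """Return (before, after) timestamp pairs that bracket unusually wide spacing.

    A spacing counts as a gap when it exceeds `tolerance` times the median
    spacing of the whole series. Hypothetical helper for illustration.
    """
    if len(timestamps) < 3:
        return []
    pairs = list(zip(timestamps, timestamps[1:]))
    typical = statistics.median(after - before for before, after in pairs)
    return [(before, after) for before, after in pairs
            if after - before > tolerance * typical]


# Evenly spaced points with one hole in the middle.
xs = list(range(0, 100, 10)) + list(range(400, 500, 10))
print(find_gaps(xs))
```

Running the same check on both the windowed-average output and the ASAP output would show whether the gap exists in only one of them.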
The data and code I'm using to plot it are available in this gist: https://gist.github.com/beerriot/5e343e35e4930947fce77f36f1f5fbe5
Post Copyright © 2017 Bryan Fink