Remember on Day 12, when I said I wasn't particularly well versed in path plotting? Welcome to Advent of Code 2021, Day 15, where that's the name of the game. We have to find the "path of least risk" through a grid of tiles with varying risk levels. This one took me a while.

Even though it looked like a bad idea from the start, I did whip up yet another depth-first traversal attempt at solving this one. Unsurprisingly, it was too slow even for the 10x10 example grid. An obvious waste of time, but it gave my subconscious time to dredge up the one path-finding strategy it could remember: A*.

A* is sort of like breadth-first search, in that it extends down all branches at once, but with a hint giving priority to branches that look cheaper. That is, from the start position, the algorithm calculates the total cost for each possible next step, and then moves to the one with the lowest (estimated) cost. All possible steps from that new position are calculated, and then those costs, *as well as the un-followed paths from previous positions*, are considered for the next move. That is, the algorithm will choose to extend the cheapest path discovered so far. So, as the search progresses, multiple paths will be slowly extended across the grid, as each one takes its turn being cheapest, until one of them reaches the goal.

Note that the description mentions an estimate. Part of the cost function for choosing a path is the cost incurred so far, but another part is a heuristic, an estimate, of how much cost is left. To guarantee that this function finds the path of *least* risk, and not just a path of relatively low risk, that heuristic must have one important feature: it can't overestimate the remaining cost.

My heuristic is the distance to the goal, as the number of steps (also known as Manhattan distance or taxicab geometry). This is basically an optimistic, "If every position from here has the lowest risk possible (1), this is the remaining cost." Anything higher could be an overestimate, if there is actually a path that is all lowest-risk.
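Written out as a tiny Python sketch (my actual implementation is Erlang; this is just illustrative), the heuristic is:

```python
def heuristic(x, y, goal_x, goal_y):
    # Manhattan distance: the minimum number of steps to the goal.
    # Since every step costs at least 1, this never overestimates.
    return abs(goal_x - x) + abs(goal_y - y)
```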

My implementation was trivial to start with: keep a list of paths sorted by their estimated cost; at each iteration, check whether we've reached the goal, and if we haven't, extend the cheapest path (the one at the front of the sorted list).

This function starts by pulling the `Cheapest` path from the `Rest` of the sorted-by-total-risk path list. If its position (`LX,LY`) is next to the goal, we add that path's `Risk` to the `risk_at` the goal, and finish the search by returning the total non-estimated risk (as well as the path, for debugging). If we're not next to the goal, the `advance_path` call computes all possible next steps from the current position on the `Cheapest` path, and the `add_paths` call puts them in their correct sorted position among the other paths under consideration. Then we start over with whatever is the cheapest path now.
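As a rough Python analogue of that loop (my real code is Erlang; this sketch checks for arrival *at* the goal rather than adjacency to it, and naively re-sorts instead of merging, but the shape is the same):

```python
def manhattan(pos, goal):
    # Admissible heuristic: every step costs at least 1.
    return abs(goal[0] - pos[0]) + abs(goal[1] - pos[1])

def find_lowest_risk(grid, goal):
    # List of (estimated_total, risk_so_far, path) tuples, kept sorted
    # so the cheapest candidate is always at the front.
    paths = [(manhattan((0, 0), goal), 0, [(0, 0)])]
    while paths:
        _est, risk, path = paths.pop(0)   # cheapest path found so far
        x, y = path[-1]
        if (x, y) == goal:
            return risk, path             # accumulated, non-estimated risk
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= ny < len(grid) and 0 <= nx < len(grid[0]):
                new_risk = risk + grid[ny][nx]
                est = new_risk + manhattan((nx, ny), goal)
                paths.append((est, new_risk, path + [(nx, ny)]))
        paths.sort()   # naive re-sort; the Erlang code merges sorted lists
    return None
```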

This works for the 10x10 example. I could tell it wasn't efficient from the short hesitation between my pressing the return key and the result printing in the terminal. Pressing return on the full 100x100 puzzle input didn't produce a result within a minute. And so began a couple of hours of experimentation.

I was worried that I'd have trouble with efficiency here, because a number of implementations place a high priority on the path sorting being efficient. Heap implementations are popular, and I didn't have one; I was using a plain sorted list. Funny enough, I went overboard on my initial implementation, writing a custom function to insert the new paths into the sorted path list with as few whole-list copies as I thought I could manage. At some point, I thought, "Maybe there's a bug here that's causing an infinite loop," and went looking for a library implementation. That's when I finally noticed `lists:merge/3` in the *standard* library.
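For the curious, a rough Python analogue of that merge step (not my actual code, which is Erlang) is `heapq.merge`, which combines already-sorted lists in a single linear pass:

```python
import heapq

# Paths are (estimated_cost, path) pairs; both lists are already sorted
# by estimate, so merging is one pass instead of repeated insertions.
rest = [(10, ["a"]), (14, ["b"]), (20, ["c"])]
new_paths = sorted([(12, ["d"]), (15, ["e"])])

merged = list(heapq.merge(rest, new_paths, key=lambda p: p[0]))
```

The win over inserting each new path individually is that the whole list is walked at most once per batch of new paths.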

Switching to the built-in `merge` caused a huge performance boost … in the first few steps of my runs. As I watched debugging output, though, I could see that as the search progressed, it still got slower and slower. Why? Well, the number of possible paths was growing, so managing it was taking more time, but I shouldn't have cared too much about that, because the cheap paths should stay near the front of the list, right? Looking at more debugging output revealed that all of the thousands of possible paths had estimated costs within 15 points of each other. Adding even a modest risk of 4 to a cheap path would push it hundreds of paths deep in the queue, and the algorithm would go off extending other paths before finally coming back to it.

I spent probably an hour trying to improve the heuristic. Was there any way to produce a more accurate estimate, without overestimating? How much could I overestimate, but still get the right answer anyway? Multiplying the heuristic by two cut time on the 10x10 example dramatically. But multiplying by three gave the wrong answer. Neither had any effect on the 100x100 input. What about multiplying by an average of the grid between this point and the goal? Well, the average for nearly any point in the 10x10 example is 3, which gives the wrong answer. The average at nearly any point in the 100x100 input is 5. It still didn't produce an answer anyway.
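Inflating an admissible heuristic like this is a known technique (weighted A*): the search gets greedier and faster, but any weight above 1 can overestimate, so optimality is no longer guaranteed. A tiny Python sketch (names are mine, not from my Erlang code):

```python
def weighted_estimate(risk_so_far, pos, goal, weight):
    # Weighted A*: weight == 1 is classic admissible A*;
    # weight > 1 speeds things up but may return a non-optimal path.
    remaining = abs(goal[0] - pos[0]) + abs(goal[1] - pos[1])
    return risk_so_far + weight * remaining
```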

But the change of multiplying the heuristic by two did have another interesting effect on the 10x10 search: it reduced the number of paths under consideration. That was my goal - were there other ways I could accomplish it? Yes! I wasn't checking whether the path was looping back on itself. In theory, this would just put it farther back in the queue, as it incurred unneeded cost. But, if the problem I was having was actually-cheap paths getting bumped down the queue, the existence of this pointless path was a problem. Checking previous points on this path would get more expensive as the path lengthened, but the time could be worth it, if it saved time elsewhere. And in fact, it did exactly that. Not enough for 100x100, but it was a good direction.
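In Python terms (again, my real `advance_path` is Erlang; this sketch is illustrative), the loop check is just a membership test on the path so far:

```python
def advance_path(grid, risk, path, width, height):
    # Extend a path by one step in each direction, skipping points
    # that fall off the grid or that this path has already visited.
    x, y = path[-1]
    new_paths = []
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        in_bounds = 0 <= nx < width and 0 <= ny < height
        if in_bounds and (nx, ny) not in path:   # the loop check
            new_paths.append((risk + grid[ny][nx], path + [(nx, ny)]))
    return new_paths
```

The `(nx, ny) not in path` scan is what gets linearly more expensive as paths lengthen.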

When I realized that I was throwing out a path that looped back on itself, but wasn't considering whether *another* path had reached that point, I initially threw the thought away. It would be way too expensive to look through every path in the path list every time I added a step to one path. But then a lightbulb went on: I didn't care if another path had *been* there, I cared if another path had been there *more efficiently*. What I needed was a companion map, showing not the risk of each point in the grid, but the lowest path risk I had found to that point so far! Rereading the Wikipedia page on A* now, I see that this is mentioned in the pseudocode. I didn't get that far at 1am. *Sigh.*

The companion map is the mysterious `CostMatrix` you may have noticed in the `find_lowest_risk` function. I initialize it with the maximum possible risk that a direct path to each point could ever accumulate: the cost if every grid cell held the maximum risk of 9. When `advance_path` creates its list of `NewPaths` from the current position, it `review`s their accumulated risk against the value at the new point in `CostMatrix`. If this path has a lower risk, we update `CostMatrix` and keep the path to consider in the future. If the path has equal or greater risk, we throw it out. (If the risk is equal to one already found, there is another low-risk path in the consideration list, and since we only care about finding one lowest-cost path, we don't need to keep both around.) What I think is neat is that, not only does this sidestep the inefficiency of checking whether the new point is on any other path, it also sidesteps the inefficiency of checking whether this point is on *this* path!
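A rough Python sketch of that `review` step (my actual code is Erlang; here `cost_matrix` is a dict defaulting to infinity rather than a pre-filled worst-case matrix):

```python
def review(new_paths, cost_matrix):
    # Keep a path only if it reaches its endpoint more cheaply than any
    # path seen so far; record the new best-known cost when it does.
    kept = []
    for risk, path in new_paths:
        point = path[-1]
        if risk < cost_matrix.get(point, float("inf")):
            cost_matrix[point] = risk
            kept.append((risk, path))
    return kept
```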

The first time I pressed enter with this "least cost matrix" in place, it returned so fast that I thought I must have left my debugging breakpoints in place. Nope. Once you get rid of paths that are already known to be less efficient than others, A* is quick - around 100 milliseconds to solve the 100x100 puzzle input.

Part 2 is just a simple, "Turn it up to 11" extension of the problem. Instead of the grid being 100x100, it's now 500x500. Twenty-five times as many points - can your algorithm handle it?
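For reference, a small Python sketch of the tiled lookup (my actual `risk_at`/`expand_territory` work on a binary; this assumes the puzzle's rule that each tile to the right or down adds 1 to every risk, with values above 9 wrapping back to 1):

```python
def expanded_risk(grid, x, y):
    # The full map is the original grid tiled 5x5. Each tile step right
    # or down bumps every risk by 1, wrapping 10 -> 1, 11 -> 2, etc.
    height, width = len(grid), len(grid[0])
    base = grid[y % height][x % width]
    bump = x // width + y // height
    return (base - 1 + bump) % 9 + 1
```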

Fourteen seconds was *just* long enough for me to consider stopping the run to look for more inefficiencies. A 14x increase in runtime for a 25x larger space. I'll take it for now. If I were to try to squeeze more time out of the run, I think the place I would start is with applying the cost matrix in the other direction. I'm now rejecting new paths that aren't better than what already exists, but I'm not yet retroactively filtering out existing paths when a new one is found that reaches any of their points more cheaply. That is, not until those paths are selected for extension, and only then if their next step happens to be more expensive than something *else* already found. Or maybe such a filter would be too expensive in its own right.

The worst part about this misdirect is that I started the work hoping that I'd get to show off my risk-map representation, which I did as one big binary instead of the list-of-row-lists I've used a lot here. I think this post is long enough as it is, though, so I'll leave you to investigate `parse_input/1`, `expand_territory/2`, and `risk_at/2` on your own.

I also need to publish a little earlier than I have been, and go prepare for a storm forecasted to be potentially severe here tonight. We may be looking at winds over 50mph. This might be the test of the math I did for 80mph winds interacting with my Starlink tower. Wish me luck in having power and internet to work on Day 16!

Categories: Development AdventOfCode

Post Copyright © 2021 Bryan Fink