Advent of Code, in Erlang: Day 15

Published Wednesday, December 15, 2021 by Bryan

Remember on Day 12, when I said I wasn't particularly well versed in path plotting? Welcome to Advent of Code 2021, Day 15, where that's the name of the game. We have to find the "path of least risk" through a grid of tiles with varying risk levels. This one took me a while.

Even though it looked like a bad idea from the start, I did whip up yet another depth-first traversal attempt at solving this one. Unsuprisingly, it was too slow even for the 10x10 example grid. An obvious waste of time, but it gave my subconscious time to dredge up the one path-finding strategy it could remember: A*.

A* is sort of like breadth-first search, in that it extends down all branches at once, but with a hint giving priority to branches that look cheaper. That is, from the start position, the algorithm calculates the total cost for each possible next step, and then moves to the one with the lowest (estimated) cost. All possible steps from that new position are calculated, and then those costs, as well as the un-followed paths from previous positions are considered for the next move. That is, the algorithm will choose to extend the cheapest path discovered so far. So, as the search progresses, multiple paths will be slowly extended across the grid, as each one takes its turn being cheapest, until one of them reaches the goal.

Note that the description mentions and estimate. Part of the cost function for choosing a path is the cost incurred so far, but another part is a heuristic, an estimate, of how much cost there is left. To guarantee that this function finds the path of least risk, and not just a path of relatively low risk, that heuristic must have one important feature: it can't overestimate the remaining cost.

distance_to_goal({X,Y}, {{Width, Height},_}) ->
    (((Width-X)+(Height-Y))-2).

My heuristic is the distance to the goal, as the number of steps (also known as Manhattan distance or Taxicab geometry). This is basically an optimisitic, "If every position from here has the lowest risk possible (1), this is the remaining cost." Anything else could be an overestimate, if there is actually a path that is all lowest-risk.

My implementation was trivial to start with: keep a list of paths sorted by their expected cost, at each iteration check to see if we've reached the goal, and extend the cheapest path (the one at the front of the sorted list) if we haven't.

find_lowest_risk({{Width,Height},_}=Territory,
                 [Cheapest|Rest],
                 CostMatrix) ->
    {[{LX, LY}|_], Risk, _} = Cheapest,
    case {Width-LX, Height-LY} of
        {1,2} ->
            %% just north of end
            {Risk + risk_at({Width-1, Height-1}, Territory), Cheapest};
        {2,1} ->
            %% just west of end
            {Risk + risk_at({Width-1, Height-1}, Territory), Cheapest};
        _ ->
            {NewPaths, NewCostMatrix} =
                advance_path(Cheapest, Territory, CostMatrix),
            find_lowest_risk(Territory,
                             add_paths(NewPaths, Rest),
                             NewCostMatrix)
    end.

This function starts by pulling the Cheapest path from the Rest of the sorted-by-total-risk path list. If its position (LX,LY) is next to the goal, we add that path's Risk to the risk_at the goal, and finish the search by returning the total non-estimated risk (as well as the path for debugging). If we're not next to the goal, the advance_path call computes all possible next steps from the current position on the Cheapest path, and add_paths call puts them in their correct sorted position among the other paths under consideration. Then we start over with whatever is the cheapest path now.

This works for the 10x10 solution. I could tell it wasn't efficient from the short hesitation between me pressing the return key, and the result printing in the terminal. Pressing return on the full 100x100 puzzle input did not return a result in a minute. And so began a couple of hours of experimentation.

I was worried that I'd have trouble with efficiency here, because a number of implementations place a high prioirty on the sorting of paths being efficient. Heap implementations are popular, and I didn't have one. I was using a plain sorted list. Funny enough, I went overboard on my initial implementation, writing a custom function to insert the new paths into the sorted path list with as few whole-list copies as I thought I could manage. At some point, I thought, "Maybe there's a bug here that's causing an infinite loop," and went looking for a library implementing it. That's when I finally noticed lists:merge/3 in the standard library.

add_paths(A, B) ->
    lists:merge(fun order_paths/2, A, B).

order_paths({_,VA,DA}, {_,VB,DB}) ->
    (VA+DA) =< (VB+DB).

Switching to the built-in merge caused a huge performance boost … in the first few steps of my runs. As I watched debugging output, though, I could see that as the search progressed, it still got slower and slower. Why? Well, the number of possible paths was growing, so managing it was taking more time, but I shouldn't care too much about that, because the cheap paths should stay near the start, right? Looking at more debugging output revealed that all of the thousands of possible paths had estimated costs within 15 points of each other. Adding even a modest risk of 4 to a cheap path would push it hundreds of paths deep in the queue, and the algorithm would go off extending other paths before finally coming back to it.

I spent probably an hour trying to improve the heuristic. Was there any way to produce a more accurate estimate, without overestimating? How much could I overestimate, but still get the right answer anyway? Multiplying the heuristic by two cut time on the 10x10 example dramatically. But multiplying by three gave the wrong answer. Neither had any effect on the 100x100 input. What about multiplying by an average of the grid between this point and the goal? Well, the average for nearly any point in the 10x10 example is 3, which gives the wrong answer. The average in nearly any point in the 100x100 input is 5. It still didn't give any answer anyway.

But, the change of multiplying the heuristic by two did have another interesting effect on the 10x10 search: it reduced the number of paths under consideration. That was my goal - were there other ways I could accomplish it? Yes! I wasn't checking whether the path was looping back on itself. In theory, this would just put it farther back in the queue, as it incurred unneeded cost. But, if the problem I was having was actual cheap paths getting bumped down the queue, the existence of this pointless path was a problem. Checking previous points on this path would get more expensive as the path lengthened, but the time could be worth it, if it saved time elsewhere. And in fact, it did exactly that. Not enough for 100x100, but it was a good direction.

When I realized that I was throwing out a path that looped back on itself, but wasn't considering whether another path had reached that point, I initially threw the thought away. It would be way too expensive to look through every path in the path list every time I added a step to one path. But then a lightbulb went on: I didn't care if another path had been there, I cared if another path had been there more efficiently. What I needed was a companion map, showing not the risk of each point in the grid, but the lowest path risk I had found to that point so far! Rereading the Wikipedia page on A* now, I see that this is mentioned in the psuedo code. I didn't get that far at 1am. Sigh.

init_cost_to_point({{Width,Height},_}) ->
    [ [ 9*(X+Y) || X <- lists:seq(0, Width-1) ]
      || Y <- lists:seq(0, Height-1) ].

review_cost_to_point({X,Y}, NewCost, CostMatrix) ->
    {Above, [Row|Below]} = lists:split(Y, CostMatrix),
    {Left, [Elt|Right]} = lists:split(X, Row),
    case NewCost < Elt of
        true ->
            {change, Above ++ [ (Left ++ [NewCost|Right]) | Below ]};
        false ->
            nochange
    end.

advance_path({[{LX,LY}|_]=Path, Risk, _},
             {{Width, Height},_}=Territory,
             CostMatrix) ->
    NewPoints = [ {X,Y}
                  || {X,Y} <- [{LX+1,LY},{LX-1,LY},
                               {LX,LY+1},{LX,LY-1}],
                     X >= 0, X < Width,
                     Y >= 0, Y < Height ],
    NewPaths = [ {[Step|Path],
                  Risk + risk_at(Step, Territory),
                  distance_to_goal(Step, Territory)}
                 || Step <- NewPoints ],
    {Cheaper, NewMatrix} =
        lists:foldl(fun({[Step|_], R, _}=P, {AccP, AccM}) ->
                            case review_cost_to_point(Step, R, AccM) of
                                {change, NewAccM} ->
                                    {[P|AccP], NewAccM};
                                nochange ->
                                    {AccP, AccM}
                            end
                    end,
                    {[], CostMatrix},
                    NewPaths),
    {lists:sort(fun order_paths/2, Cheaper),
     NewMatrix}.

The companion map is the mysterious CostMatrix you may have noticed in the find_lowest_risk function. I initialize it with the maximum possible risk that a direct path to each point could ever find: if all grid cells held a maximum risk of 9. When advance_path creates its list of NewPaths from the current position, it reviews their accumulated risk against the value at the new point in CostMatrix. If this path has a lower risk, we update CostMatrix and keep the path to consider in the future. If the path has equal or greater risk, we throw it out. (If the risk is equivalent to one already found, there is another low-risk path in the consideration list, and since we only care about finding one lowest-cost path, we don't need to keep both around.) What I think is neat is that, not only does this sidestep the inefficiency of checking whether the new point is on any other path, it also sidesteps the inefficiency of checking whether this point is on this path!

SmallT = puzzle15:parse_input(Example).
timer:tc(puzzle15, find_lowest_risk, [SmallT]).
% {154, {40,   …}}
%  ^us   ^risk ^path
timer:tc(puzzle15, solveA, []).
% {101232, …}
%  ^us     ^no spoilers
%%

The first time I pressed enter with this "least cost matrix" in place, it returned so fast, I thought I must have left my debugging breakpoints in place. Nope. Once you get rid of paths that are already known to be less efficient than others, A* is quick - around 100milliseconds to solve the 100x100 puzzle input.

Part 2 is just a simple, "Turn it up to 11" extension of the problem. Instead of the grid being 100x100, it's now 500x500. Twenty-five times as many points - can your algorithm handle it?

timer:tc(puzzle15, solveB, []).
% {13937028, …}
%  ^us       ^no spoilers
%%

Fourteen seconds was just long enough for me to consider stopping the run to look for more inefficiencies. A 14x increase in runtime for a 25x larger space. I'll take it for now. If I were to try to squeeze more time out of the run, I think the place I would start is with applying the cost matrix in the other direction. I'm now rejecting new paths that aren't better than what already exists, but I'm not yet retroactively filtering out existing paths if a new one is found that reaches any of its points more cheaply. That is, not until those paths are selected for extension, and only then if their next step happens to be more expensive than something else already found. Or maybe such a filter would be too expensive in its own right.

The worst part about this misdirect is that I started the work hoping that I'd get to show off my risk-map representation, which I did as one big binary instead of the list-of-row-lists I've used a lot here. I think this post is long enough as it is, though, so I'll leave you to investigate parse_input/1, expand_territory/2, and risk_at/2 on your own.

I also need to publish a little earlier than I have been, and go prepare for a storm forecasted to be potentially severe here tonight. We may be looking at winds over 50mph. This might be the test of the math I did for 80mph winds interacting with my Starlink tower. Wish me luck in having power and internet to work on Day 16!