Advent of Code, in Erlang: Day 23

Published Saturday, December 25, 2021 by Bryan

Between a complicated Day 23 puzzle, and last-minute holiday gift work (and holiday celebration!), I've fallen behind in Advent of Code. But, I do finally have a solution for Day 23, so let's dig in.

The Day 23 puzzle reminded me of a completely different aspect of college. At the end of every term, my fraternity would run a "rooming game" to allow people to switch up which room of the house they were living in. The time between that day and the start of the next term was always a frenzy of trying to figure out who needed to move next to unblock the rest of the moves. The resemblance to the puzzle is really too strong: mostly two-person rooms, limited hallway space, some people with very little to move and others with lots, even refusals to move in until everyone in the room had moved out! Several times it devolved into engineers standing around a chalkboard drawing dependency graphs.

Pretty much every one of those moves eventually involved breaking a rule the amphipods of Day 23's puzzle have to live by. So, it was never really worth it to figure out the full, cheapest solution. But today, I got to live out the fantasy.

I didn't see much strategy at first, so I decided to use the graph-exploring hammer I just relearned on Day 15: A*. The star stands for "mphipod", right?

On Day 15, the decision at each step was only which direction to move the submarine next. Today, the decision is not only where to move an amphipod, but which amphipod to move. So, the option generator has to do a little more work to assemble the list of next steps. For each amphipod, it has to consider where it is, where it's going, and whether any other amphipod is in the way. This leads to step 1 producing 28 initial options: each of the amphipods at the hallway end of each room can move into any of the hallway positions. The next step after each of those initial options produce slightly fewer options, because one of the hallway spaces is taken, but the tree still gets wide quickly.

Finding ways to reduce this explosion became core to making the algorithm run fast enough to produce results. My biggest time wins came from two stages of the same realization. Stage 1: if an amphipod can move into its final room, it is not worth considering any other move for it. This primarily affects amphipods that are still in the wrong room. If there are no amphipods blocking their way in the hallway or their target room, then they have no reason to consider stopping at any hallway position. Stage 2: if any amphipod can move into its final room, it is not worth considering any other move for any amphipod. This collapses graph nodes quickly, because it doesn't even require adding the option back into the sorted list. When a room move becomes legal, make the move, and then consider what moves are legal after that. But also, if the room move reveals another legal room move, do that one immediately too, and continue until a state where only hallway moves are available.

While datastructure optimizations and some estimate accuracy improvements lowered my code's runtime on the Part 1 example from 60-something seconds to about 20 seconds, Stage 1 took it to 2 seconds. Stage 1 was also responsible for finally brining the runtime on the real Part 1 input from I-don't-know-how-long-because-I-didn't-wait to an I-would-have-canceled-had-I-not-had-debug-output-showing-progress 160 seconds. Stage 2, almost unbelievably, took the real Part 1 runtime from 160 second to 160 milliseconds.

I never ran the earlier versions of my algorithm against the Part 2 input. As it was so slow, I actually solved Part 1 by hand, with iPad and Pencil, before I got my code to print the right answer. So, I didn't even make the modifications to enable evaluating Part 2 until I had gotten Part 1's runtime into the sub-second range. That was the right choice. Even with Stage 2 in place, the Part 2 example input took over 8.5 minutes to solve. The real Part 2 input took nearly 12 minutes. If the 1000x savings were applied in reverse, these would have taken 6-8 days to run on the pre-optimized code - they still wouldn't be done now!

I've only linked to, and not shown, code in today's post yet, because it's actually a fair amount of code to make this all work. Knowing how much trouble I had getting A* to work the first time in this series, and how much more complicated the movement rules were this time, I went a little overboard in breaking each piece down. For example, the distance from any space, room or hallway, to any other space (used to calculate and estimate cost), is its own pair of functions:

distance({hallway, H1}, {hallway, H2}) ->
    abs(H1-H2);
distance({room, R, TB}, {hallway, H}) ->
    TB + distance({hallway, hallway_outside_room(R)}, {hallway, H});
distance({hallway,_}=H, {room, _, _}=R) ->
    distance(R, H);
distance({room, R, TB1}, {room, R, TB2}) ->
    abs(TB1-TB2);
distance({room, R1, TB1}, {room, R2, TB2}) ->
    TB1 + TB2 + distance({hallway, hallway_outside_room(R1)},
                         {hallway, hallway_outside_room(R2)}).

hallway_outside_room(a) -> 2;
hallway_outside_room(b) -> 4;
hallway_outside_room(c) -> 6;
hallway_outside_room(d) -> 8;
hallway_outside_room({room, Type, _}) -> hallway_outside_room(Type);
hallway_outside_room(_) -> false.

But, breaking the code down in this way means that they very particular conditions to find legal moves can be separated easily from the code that actually runs the branch-extending solution loop:

solve([{BestCost, Est, Best}|Options], Seen, RoomSize, Debug) ->
    case Debug rem 1000 of
        0 ->
            io:format("length(Options) = ~p "
                      "size(Seen) = ~p "
                      "BestCost = ~p "
                      "Total = ~p~n",
                      [length(Options), maps:size(Seen), BestCost,
                       BestCost+Est]);
        _ ->
            ok
    end,
    case Seen of
        #{Best := LowerCost} when LowerCost < BestCost ->
            %% we found a cheaper path to this state since this option
            %% was put in the list
            solve(Options, Seen, RoomSize, Debug+1);
        _ ->
            {RoomCost, RoomOnly} =
                make_room_only_moves(BestCost, Best, RoomSize),
            %% cost comparison s efficiency only: we know Best wasn't finished
            case (RoomCost =/= BestCost) andalso finished(RoomOnly) of
                true ->
                    {RoomCost, RoomOnly};
                false ->
                    NewOptions = all_legal_moves(RoomCost, RoomOnly, RoomSize),
                    {CheapOrNew, NewSeen} =
                        update_and_filter_seen(NewOptions, Seen),
                    solve(lists:merge(fun sort_total_path_cost/2,
                                      lists:sort(fun sort_total_path_cost/2,
                                                 CheapOrNew),
                                      Options),
                          NewSeen, RoomSize, Debug+1)
            end
    end.

That Debug printout is what kept me hanging on for the Part 2 solution. The first few 1000-iteration steps went like so:

length(Options) = 0 size(Seen) = 1 BestCost = 0 Total = 34200
length(Options) = 1274 size(Seen) = 2275 BestCost = 2205 Total = 34202
length(Options) = 2804 size(Seen) = 4805 BestCost = 5007 Total = 34204
length(Options) = 4623 size(Seen) = 7624 BestCost = 11056 Total = 34222
length(Options) = 4958 size(Seen) = 8959 BestCost = 2293 Total = 34280
length(Options) = 5738 size(Seen) = 10739 BestCost = 2700 Total = 34400
length(Options) = 6326 size(Seen) = 12327 BestCost = 2703 Total = 34402

The numbers grew for minutes. These are:

length(Options)
The number of states currently under consideration (in the sorted list, waiting to become the cheapest option).
size(Seen)
The number of unique states that have been seen (the size of the map listing the cheapest cost observed to arrive at a state).
BestCost
The cost to arrive at the current state being considered (the state with the cheapest spent+estimated cost).
Total
The estimated total cost (including the already-spent cost) to get from the current state being considered to a solution.

Occasionally the length of my list of options to consider would fall slightly, only to grow again a few lines later. The peak, 304 debug lines later was here:

length(Options) = 60295 size(Seen) = 360654 BestCost = 9503 Total = 36802
length(Options) = 60474 size(Seen) = 361833 BestCost = 5503 Total = 36802
length(Options) = 60761 size(Seen) = 363120 BestCost = 5943 Total = 36802
length(Options) = 60379 size(Seen) = 363738 BestCost = 4106 Total = 36804
length(Options) = 60167 size(Seen) = 364526 BestCost = 12806 Total = 36804

There were a couple of short upward swings for another minute or two, but the overall trend was downward from there, until finally it fell all the way down to the last few lines before the answer:

length(Options) = 794 size(Seen) = 810020 BestCost = 10449 Total = 45614
length(Options) = 529 size(Seen) = 810755 BestCost = 12595 Total = 45832
length(Options) = 1314 size(Seen) = 812540 BestCost = 10408 Total = 46212
length(Options) = 1755 size(Seen) = 813981 BestCost = 8462 Total = 46236

The final answer put the total cost just over 47000 (exact number redacted to avoid spoilers for those working on the puzzle now). I think it's interesting to see that the BestCost, even just before the answer was found, was still at most a quarter of the total cost. This might mean that this game goes a bit like FreeCell - very few amphipods (or cards) get to their correct position, until suddenly they all do. Or it might mean that solutions that move high-cost "D" amphipods aren't really considered until all ways to move low-cost amphipods have been exhausted.

Seeing the Total, which includes both the spent cost and the estimated remaining cost, begin at about 75% of the actual total cost, and then rise slowly to 98% might mean there is some accuracy that could be added to the estimation function. Before figuring out the room-move shortcut, I did get a nice speed boost in the example solutions by making an accuracy improvement there. I had started by saying that any amphipod in its correct room had an estimated move cost of zero. This isn't true if there are amphipods that don't belong in that room below it. Adding the cost for moving out and back in for such amphipods cut the example solution time by 50% early on.

I won't go over the rest of the solution code here, but if you'd like to look at it, I've put it on github with my other puzzle solutions.

It's Christmas Day, and I haven't started the Day 24 or 25 puzzles yet. I do hope to get to them, but now that gifts that I made this year are being opened, I also have some woodworking posts to write. If you're interested in either, make sure to add the RSS feed to your reader, or watch for my tweets, to get a notification when they're posted.