Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Day 20: My favourite problem from Advent of Code 2023 (mliezun.github.io)
82 points by nickdevx on Dec 30, 2023 | hide | past | favorite | 47 comments


I don't really like problems like these. I love Advent of Code and have got 50/50 stars on Christmas day this year -- but this type of problem grinds my gears.

The intended solution only works because the input is more constrained than what the problem statement says (the problem in full generality is PSPACE-hard). If you give me a problem to solve, I'd rather have all the hypotheses, all at once.


The same is true with part 2 of problem 8 and problem 21 which both are generally much harder problems which are made easier by carefully constructed test cases. I also find it somewhat annoying, but par for the course with advent given the same test data for part 1 & 2; and the fact that you actually see the only test case -- it's not hidden test cases like in other programming competitions.


I'm behind this year, so I haven't looked at any postmortems, but I found an interesting solution to problem 8 part 2[1] after watching the brute force methods stall out. Could you explain what you mean by carefully constructed test cases?

[1] https://github.com/flurie/aoc-rust/blob/main/src/bin/08.rs#L...


It's easy to construct inputs for which the correct solution would not be the LCM of the cycle lengths. (Just take the input you got, and insert a few extra steps between the "start" point for one of the "ghosts" and the point where it enters its otherwise unaltered cycle.)

A more general solution is possible -- but it does require a trick that people may not be aware of unless they've studied a little number theory.


It gets even more interesting if you start worrying about ghosts that could finish at multiple points within each cycle. There are potentially too many combinations to try out each one individually.


Would you use Chinese remainder theory on all possible configurations of the cycles, or something else completely?


I actually wrote the code to solve it in full generality (https://github.com/edoannunziata/jardin/blob/master/misc/Aoc...) -- CRT is the idea but it's actually the "generalized" case of CRT, because you could have moduli that are not coprime. So you have to use the CRT "in reverse" to break each linear congruence into its components (using the fact that \mathbb{Z}_m \times \mathbb{Z}_n \cong \mathbb{Z}_{mn} \iff \gcd(m, n) = 1) and then use the CRT again to find the answer.

Annoying to code, but I've done it a thousand times by hand for math competitions, so it's kind of carved into my skull.

There might be multiple cases, but I don't think there's anything better (please let me know if there is, I'm very interested) -- as a matter of fact, an instance of the original problem can be used to encode an instance of a system of linear congruences, so finding a better algorithm for the latter would imply a better algorithm for the former.


It's possible to answer "find the minimum n such that n%a is in s1 and n%b is in s2, assuming a and b are coprime" in runtime around O(n1+log(n1)*n2) where n1 is the size of s1 and n2 is the size of s2.

n1*n2 can get at least as big as a*b/8 without a naive linear search being guaranteed to be fast, so this roughly square roots the runtime of 'CRT on all pairs' if n1 is close to n2.

Here's the code, sorry it's not as well commented as yours, and I haven't integrated it into a solution to day 8.

https://github.com/penteract/adventofcode/blob/master/utils/... .

For more than 2 moduli, the best I can think of is to separate them into 2 roughly equal parts, generate the full lists of allowed remainders for each part, then use this algorithm to combine the parts. There may be better things you can do here (and perhaps you can show that for large sets of allowed remainders across enough different moduli, the solutions are spread out evenly enough there's one you can find by naive linear search).

Dealing with non-coprime moduli is left as an exercise to the reader :).


Wow, thanks! I'll check it out.


Great, thanks!


So in Question 8 part 2 you had multiple cycles through a graph. You started on all "A" nodes and you had to find the first time at which you were entirely on "Z" nodes.

The intended solution was seemingly to calculate the length of the cycle (as in, until you were on a "Z" node) for each starting node separately. Once you'd done this, you could find the LCM of these cycle lengths to find the overall period of the cycle (~10^13).

This solution might not work if your cycles weren't constant length -- for example imagine when walking your graph you found your i_th Z-node after {6, 7, 6, 7, 6, 7, 6, 7 ...} steps; or perhaps {6, 7, 7, 7, ...}. In the test data, this didn't occur - it was always {n, n, n, n, ...}.

I can give another example perhaps - consider you're asked to find the last 3 digits of 2^n for n >= 1. You start off calculating: 002, 004, 008, ... eventually you get to 2^103 which ends in 008. This means that the cycle length would be {103, 101, 101, 101, 101, ...} since it'll never get back to 002 or 004. Solving this is a bit more difficult than the constant cycle lengths, since it's not a simple LCM.


Is there any theorem that puts limits on how irregular the cycles can be in a general case of tape length L with N options corresponding to N outputs from each graph node?

What has to be constant is the cycle length for (node, tape instruction #) pairs, because all the state to determine every future step is based on the current node and tape position; if both are the same, the path will be the same as before. I think, at least without advanced math, the only thing to rely on for "true" loops is identical pairs of (instr #, node), not just instr # or tape steps or node individually.


I haven't spent that much time thinking about it, but my guess is that there may or may not be:

* a "tail" -- some initial part of the path before you get into a true cycle (like the 002, 004 in the example above)

* an inner repeating cycle before you reach a node you've seen before.

In the case given |t| = 0 and |c| = 1, but it's easy to construct a more complex example with nodes (A, B, C, D), edges (A->B, B->C, C->D, D->B), and 'ending nodes' being B and D. In this case left and right paths go to the same node. This case would have a tail of length 1, and then the inner cycles would be of length {2, 1, 2, 1 ...}.

As a result, valid 'ending states' (Z-nodes) for this graph would be after {1, 3, 4, 6, 7, 9, 10, ...} steps.


You can have an inner seemingly-repeating cycle by node, but not by (node, tape instr #) pair.

Initial tails are something they should've done, which would've foiled the naive "find the cycle lengths and use lcm" approach.


I can see the case of an aperiodic first cycle length, but given that each vertex in the graph created by the input -> output nodes of the traversal path has only one outgoing edge, is it possible for the cyclic graph from a1 to some z1 to have an inconsistent length? And is there ever going to be some z2 for which the aperiodic cycle length from a1 to either z1 or z2 results in a better answer to the problem given that the periodic cycle length of a1 to z1 is shorter than the cycle length of a1 to z2? Please forgive me if I am not using the correct graph theory terms.


I don't necessarily mind these "reverse engineering" type problems, I think you can consider your input part of the problem statement. In earlier years it was common, particularly in early days, for your input to be on the page itself IIRC, like "Your input is 103053439" rather than a link.

One thing that bothers me more though is when the example input is nonsense for Part II or can be solved but in a radically different way because it has very different properties than the inputs people are given for real. Contrast day 19, where the example input has a rational answer, which you are told about for the Part II, against day 20 where the example input is completely irrelevant for Part II and good luck.


Yeah, it's always rough when you write a solution for part 2, and it passes all the samples, but then has issues with the real input.

One thing I've tried to do when I run into those kinds of problems is to write a little benchmark to illustrate the difference between approaches. It's always kinda fun to watch your initial brute force solution start chugging while your shiny new solution seems to handle whatever you toss at it - great lesson in choosing effective algorithms.

I use Elixir to do the puzzles, so I've used Benchee for this, and it works very well. So easy to set up too.

Here's an example benchmark, which also has the output. As the input size increases you start getting some pretty crazy ratios between the two algorithms (the "smart" version was 200,000 times faster than the "naive" one on the largest test input!)

https://github.com/epiccoleman/advent_of_code_ex/blob/master...


> I think you can consider your input part of the problem statement

That would be fine if there was only one input. But we have seen countless cases of a solution working for some and not others (including/especially the test input). So the cases and patterns I see in my input may be part of the hidden problem statement, or they may be just an artifact of my particular input. I at least feel more satisfied when I know my solution will solve all inputs.

For that reason, I side with the desire for a bit more purity and closure, vs having to guess, and not being sure about your guess until you go to reddit. Making educated guesses about patterns is a valuable ability, but I don't like the aesthetics of it here.


This is definitely something where tastes vary, for example you can see below several people assumed you can multiply any numbers, and that actually worked because the numbers are co-prime, but strictly you need LCM (and since I had LCM in my toolkit I used it)

I try to write solutions which would work for inputs like mine but I'm not fussed if, for example, my code panics on hypothetical "valid" inputs that I didn't consider.

So e.g I have panics for nonsense input like, "Pipe networks with more than one Start can't be solved" and "Uneveen seed list cannot work as described in problem" [Yes that's a typo] but I also have "Should not send signals unexpectedly" and "Surely not all the stones have X velocity 0"


I like them a lot because they inject a little bit of empirical or heuristic thinking into problems. For this particular problem I immediately reached for graphviz to get an idea of what the circuit looks like and it made it obvious what the solution was.

It's much closer to how real world engineering problems work where you have to do a little bit of investigation and maybe find some creative way to constrain your problem that nobody explicitly tells you about. Much prefer it to the 'here are the exact five keywords so you know what CS class algorithm you need to use' kind of thing.


Yes, yes, yes for graphviz.

I _love_ when one of these problems encourages me to reach for a tool like that. (At least, when it's one I'm already aware of and I have time to dedicate to grokking and solving the puzzle).

I used graphviz on one of last years problems and found it pretty helpful. It just feels so cool when you learn a new "spell" to help you feel out the problem.

Another example was from 2020, where I used lexx/yacc to generate a parser for the expressions in the input. This was honestly probably slower than just doing it myself but it felt so cool to solve it with a tool - and it's neat to plant little signposts in your brain that will light up when you run across a similar class of problem in your real work.

Another thing I often do is use vim to munge the input and save myself some parsing code, which is a very generically useful branch of "magic" to have under your fingers.


First I started by modelling the devices as objects. Starting with a single base class that has most of the common behaviour.

Object oriented programming has ruined us


To expand on this: The issue is that OOP easily leads you to model your problem into your solution. This has the desired effect of solving the problem, but the likely undesired side effect that your solution encodes a specific problem description.

Thats why this is an issue


I think the problem is not OOP - the author approached the problem bottom-up, they were trying to foresee usage for code they were writing. The resulting code is a bit of a mess, not OOP (there’s even instanceof) Instead we should start with the usage, with the code creating the value and then filling in the details. In this problem, if we choose to simulate the circuit I would start with the simulator code, introducing abstractions for components only if it would help the simulator.

I liked the demonstration of this approach in Chapter 6 of Robert C. Martin’s Agile Software Development: Principles, Patterns and Practices. It’s available online here:

https://people.scs.carleton.ca/~jeanpier//Fall2021/Topic%201...


There are a few problems with OOP. The one that bothers me the most is that there is immediately boilerplate complexity. How is this bag of data addressed? What does it behave like? How is each piece of it read and written? Now you need collection generics. There’s complexity everywhere, and it’s expensive. It hampers concurrency, and doesn’t work on GPUs. What do we get for it? Implicit control flow, and it isn’t even clear to me that this is a benefit let alone worth the costs. Any time performance is remotely a concern, object orientation should be the first thing removed.


Yeah. Generally the best code in OOP languages tends to favour composition over inheritance. In other words, it uses functional ideas and works around some of the clunkiness of OOP to build generic structures and algorithms. This is what really successful packages in OOP languages look like (such as numpy and friends).


What ruins folks these days? Functional?


Rust


Everything rusts’ eventually


My favourite solution to this problem was going all in on analyzing the input. Instead of just assuming the set of modules must have cyclic behaviour and running the simulation until you find the periods, look at the input and _really_ understand what its doing.

What you will find is that the modules form a set of binary counters (chains of flip flops), with a number encoded into them via whether they are connected to a "hub" node of the chain (a conjunction).

You can parse the module structure and traverse the graph to extract that number. The connections to the hub are the bits of the number (1 if module is connected, 0 if not). Do that for all the counters and LCM (or multiply since they are all coincidentally co-prime) them together to get your answer.

No simulation required.


Yup. This is a rare pen and paper solution for me. I ran simulations to confirm the first two cycles and that I was reading the binary right, but I got the solution directly from the input by hand.

It is my favorite problem of AoC as well this year mostly because it was the first problem in many years where when part 2 popped I didn't immediately mostly know what algorithm they were going for.


I feel like the quality of puzzles has fallen over the years. This year had much more “spot something in the real input which neither the examples nor the puzzle itself states” than any previous I participated in. I really don’t enjoy those sorts of puzzles and, at least for me, it isn’t what I’m participating in AoC for.


Correct.

Part 1: easy.

Apply part 1 solution to example, works.

Apply part 1 solution to real input, works.

Part 2: hard to very hard.

Apply part 2 solution to example, usually works.

Apply part 2 solution to real input, doesn't work due to time / memory constraints.

But having to look at the input to discover a pattern that isn't explained as part of the instructions always irks me.


I didn't enjoy the puzzles as much as previous years. My guess was they were structured in a way to make them non-trivial to solve with chat-GPT as the leaderboard is taken seriously by many.


Lots of people have speculated that this is true, but the creator of the puzzles (Topaz) has indicated that they did not concern themselves with Chat GPT or LLMs beyond specifying that if you do just use an LLM you should not attempt to claim top leaderboard spots which are for humans only.

In fact I didn't see people wrestling with LLMs trying to get them to solve early days and I can't believe it's impossible. My expectation is that the fad has passed, the sort of people who tried Chat GPT in 2022 because it was hot have moved on, the sort of people who resorted to it because they don't like AoC just didn't do AoC in 2023. If you enjoy these puzzles it makes sense for you to solve them, not to ask a machine to do it.

That's what many people who resorted to Z3 found frustrating. A Z3 solution to Day 24 is arguably "correct" and it certainly "works" but it's not very satisfying in terms of feeling like you achieved anything. This is why I wrote a solution which didn't do that even though it was days slower to write.


As a small counterpoint early on I do remember people trying and failing to use LLMs which prompted some of the speculation about the competition being made LLM-proof. As early as I think day 2 there was a big discussion about how no matter what people did the SotA LLMs could not give a correct solution for part 2. After that at least on the subreddit discussions involving LLMs were downvoted intentionally and so while presumably people were doing it there wasn't any discussion on Reddit at least.

I kind of agree with your last point. I try to do all the problems without non-built in python packages myself for exactly the reason you describe. Just feels more satisfying to me.


Interesting about the LLMs. I admit I don't read very much early AoC Reddit because I am most likely to be reading if I have questions after a few hours (like, wait, are these huge numbers prime or am I bad at arithmetic? Does my approach work and I screwed up, or am I an idiot and I'm on the wrong track entirely?) and in the first ten days or so I often don't have any questions.


Would you rather they re-skin the same puzzles people have memorized?


I’d rather they either give the assumptions in the puzzle description or at least in the examples. There were a few puzzles which I didn’t enjoy due to personal reasons but those I don’t have anything against, so long as the problem doesn’t boil down to „figure it out based on the real input since the example input doesn’t reflect it”


One thing to note is that periodicity need not all be prime numbers, so you can't always find the product of their periods -- you might need to find their LCM.


> But if we multiply the numbers together we get a number that is divisible by every number in the table.

Wouldn’t LCM be the correct/more general approach? The periods could have common factors, right?


That's correct. In practice it appears that the AoC inputs provide numbers which are always co-prime (ie have no common factors other than 1).


Do you know if there were any inputs on this problem that had non-prime numbers? Like others, I used LCM in my code, but mine were all prime numbers.


I don't know, presumably Topaz (who creates AoC) would know, but isn't telling.

We would probably find out if some inputs aren't co-prime because the naive multiplying solution breaks, but they could be non-prime and yet co-prime, for example 15, 14, 11, 23 is a set of numbers which are co-prime, but neither 15 nor 14 are prime.


I had a difficult time with both part 1&2 of Day 20. At first I just mentally modeled the circuit as "clock-based" where the Flip-Flops flip only when all previous pulses are processed and a synchronizing clock signal is given,

e.g. a flip-flop (default to low) with 2 input ports, on first clock signal (time 1) both ports receives a low pulse; second clock signal (time 2) the flip-flop inspects 2 inputs and decides to give a low (two flips) at the end.

Frustratingly, this model passes the 2 examples flawlessly but fails for the real input.


I was half expecting someone to model this in VHDL and to synthesize it in a FPGA to get the answer to part 2


I regret that I cannot read this article. At least not for another day or so... I only just finished #19.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: