Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Maybe we then need an LLM to tell us if two pieces of compiled code are equivalent in an input-output mapping sense (ignoring execution time).

I'm actually serious; it would be exceedingly easy to get training data for this just by running the same source code through a bunch of different compiler versions and optimization flags.



An LLM cannot do this. I don’t even mean this in a formal sense, because your problem is addressed by Rice’s Theorem, which places bounds on what any system (LLM or not) can do here; I mean it in the sense that an LLM isn’t even appropriate to use here because the best it can possibly do is provide you with its best guess at the answer. And while this might be a useful property for decompilation in general that’s not what was being discussed here.


Rice's theorem does NOT prevent a program from giving correct answers to non-trivial properties of programs (including the halting problem or other undecidable problems) for 99.99% of inputs and "I don't know" for 0.01% of inputs. It only states that you cannot write a program that provides a correct and definitive yes-or-no for 100% of inputs.

For a decompiler, being able to decompile even 90% of programs would be awesome. We're not looking for theoretical perfectness.


Why would an llm be the tool for that job?


Without analytical thinking how else would you come to conviction that two functions are identical, for a computationally unfeasible number of possible inputs?


Formal logic / formal proofs. We have good systems for verifying that.

The proper flow is that you use LLM to generate decompilation steps, along with potential proofs, and then use old algorithms from 1970s that verify that the steps are correct.

Source: I built a decompiler for EVM, arguably the best one on the market, and to some extent it was how it worked (and others comparable in class).

The issue was always the exploration of possible transformations of code, once you manage to find the right ones (which LLMs can propose way better than old hard coded rules and SMT solvers), it's simple to verify that the transformations are correct.




Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: