Google's perf tools used to support stack ranking and layering. I used to love it... because it was completely open-ended. You could stack rank on any dimension you wanted to, and you didn't even have to say what the dimension was. You could stack rank anybody, over arbitrary sets. AFAIK the results were only ever used in aggregate -- without the dimensions there really wasn't any other way to use them.
You could stack rank how nice you thought people were, making two layers -- like "really nice" and "mostly nice" (of course, there were no labels in the tool). Or you could use it to highlight two particular team-mates who really stood out by making a layer for them, and then a layer for everyone else.
It could also be used to stack-rank the entire management chain, which was fun.
Any manager at Google who quotes "lines of code" in any context other than actively defending someone or noting something truly exceptional is doing so because they have no idea what their report actually does, or whether it matters.
A real PIP always has HR involved. Note to any Googlers: if a manager tells you you're on a PIP, feel free to reach out directly to HR and ask for information about it. Or ask your skip level, who will be pretty surprised if it's not a real one, and probably not too happy about it.
On your second example, this is where having senior technical people in the calibration room is really important. The key there is ensuring that said technical people know about the work before they are in the room. It is really difficult to change outcomes after the fact. You can't expect them to defend something they don't know about.