Given a specific target, and a set of rules that define what is optimal code for that target,

One doesn't need the notion of what is optimal in order to achieve this.

can it be proven that the LLVM IR provides the necessary and sufficient abstractions from which optimal code can be generated?

By the phrase optimal code `can' be generated, I'll assume you mean, you can add an oracle pass that `just knows' what optimal code is to llvm.

Sure, as long as the IR includes a .byte escape, the proof is trivial. llvm includes such a construct, so trivially, it can be used to formulate optimal code. I'd call this the degenerate case, and I realize you're probably not interested in this case, but, there it is. Beyond that, in the general case, I'd expect the answer is no. If you want to limit the question to a specific machine, with certain limitations, one can engineer the answer to be yes. From a theoretic perspective, I'd call that cheating. But, if your willing to limit things, one just needs to show a 1 to 1 mapping of constructs in the IR to instructions and show that every instruction has such a mapping.

Now, you ask, what would such limitations looks like. Turn the CS page back 20 years and check out self modifying code and laying out code to encode meaning into the addresses used by that code for examples of those things that most modern compilers take no advantage of. llvm cannot in general encode those sorts of details in its IR and reason about them, and I suspect a proof showing the existence of such a machine requiring such things for optimal code would be trivial enough.

If I "raise" optimized assembly language code into LLVM IR, is it

possible (thoeretically, provably) to generate equal, or more, optimal

assembly code from it?

Trivially, yes. You generate .byte encodings for the entire thing, and upon output, you'd get the same answer back, so it would have equal or more optimal assembly code. Proving you can get more optimal code out of it, in the general case, is trivially impossible, just consider the case when the optimal code is given as input.