LLVM bitcode binding use case


I have a use case where I need to compare equivalent function definitions from a common code base that has been compiled to LLVM bitcode in two separate ways. By compare, I do not mean comparing programs in the most general sense, as I know that would be naive and problematic; I want to find identical pointer usage arising from different expressions. Brute-forcing the space of all possible pairings between the two, after they have been normalized, is a feasible solution, because def-use chains tend to be rather small and two expressions are either identical or they are not. Constant folding is one way to achieve this, by tagging the SSA value associated with a particular property.
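To make the comparison concrete, here is a minimal toy sketch in C++ of what I have in mind. It does not use the LLVM API at all: `Expr`, `leaf`, `node`, `canonical`, and `matchPointerExprs` are hypothetical names of my own, with `Expr` standing in for an SSA value and its operands. The only normalization assumed here is sorting the operands of commutative operations.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an SSA value; a real tool would wrap llvm::Value.
struct Expr {
    std::string op;                               // e.g. "add", "gep", "arg0", "const:4"
    std::vector<std::shared_ptr<Expr>> operands;  // empty for leaves
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr leaf(std::string name) {
    return std::make_shared<Expr>(Expr{std::move(name), {}});
}

ExprPtr node(std::string op, std::vector<ExprPtr> operands) {
    return std::make_shared<Expr>(Expr{std::move(op), std::move(operands)});
}

// Assumption: only these two toy operations are treated as commutative.
bool isCommutative(const std::string& op) {
    return op == "add" || op == "mul";
}

// Build a canonical text form of the def-use chain rooted at e.
// Operands of commutative operations are sorted so that operand order
// does not affect the result.
std::string canonical(const ExprPtr& e) {
    assert(e && "expression must not be null");
    if (e->operands.empty()) return e->op;
    std::vector<std::string> parts;
    for (const ExprPtr& operand : e->operands) parts.push_back(canonical(operand));
    if (isCommutative(e->op)) std::sort(parts.begin(), parts.end());
    std::string out = "(" + e->op;
    for (const std::string& part : parts) out += " " + part;
    return out + ")";
}

// Brute force: compare every expression from build A against every
// expression from build B, and report the index pairs that are identical
// after normalization.
std::vector<std::pair<std::size_t, std::size_t>>
matchPointerExprs(const std::vector<ExprPtr>& a, const std::vector<ExprPtr>& b) {
    std::vector<std::pair<std::size_t, std::size_t>> matches;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const std::string keyA = canonical(a[i]);
        for (std::size_t j = 0; j < b.size(); ++j) {
            if (keyA == canonical(b[j])) matches.emplace_back(i, j);
        }
    }
    return matches;
}
```

With this, `gep(arg0, add(i, const:4))` from one build and `gep(arg0, add(const:4, i))` from the other would be reported as the same pointer expression, while `gep(arg0, i)` would not. The quadratic pairing is deliberate, since the chains are small.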

So my questions are:

Is there a corpus of known source code targets that I can easily compile to both LLVM bitcode and an executable on Linux?

Is there an API and tool with which I can build an LLVM plugin or utility that will load a bitcode file, transform it into the appropriate data structures and LLVM IR, and feed that IR to my plugin or utility according to some traversal or simple brute-force enumeration, driven from the command line? As in:

load-bitcode ./myplugin.so -- bitcodefile.bitcode

Is there an easy way to perform constant folding on a given LLVM IR SSA value?

Is there an existing data structure that allows storing LLVM IR in a Set or Map type?
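To clarify what I mean by the last question, here is a toy sketch in C++ of the behaviour I am after: a set or map whose keys compare by structure rather than by address, so that two separately built but identical expressions collapse into one entry. Nothing here is LLVM API; `Expr`, `make`, `compare`, `StructuralLess`, `ExprSet`, and `ExprMap` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IR value and its operands.
struct Expr {
    std::string op;
    std::vector<std::shared_ptr<Expr>> operands;
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr make(std::string op, std::vector<ExprPtr> operands = {}) {
    return std::make_shared<Expr>(Expr{std::move(op), std::move(operands)});
}

// Three-way structural comparison: by operation name, then operand count,
// then operands from left to right. Returns <0, 0, or >0.
int compare(const Expr& a, const Expr& b) {
    if (int c = a.op.compare(b.op)) return c;
    if (a.operands.size() != b.operands.size())
        return a.operands.size() < b.operands.size() ? -1 : 1;
    for (std::size_t i = 0; i < a.operands.size(); ++i) {
        if (int c = compare(*a.operands[i], *b.operands[i])) return c;
    }
    return 0;
}

// Strict weak ordering for ordered containers, based on structure only.
struct StructuralLess {
    bool operator()(const ExprPtr& a, const ExprPtr& b) const {
        return compare(*a, *b) < 0;
    }
};

using ExprSet = std::set<ExprPtr, StructuralLess>;

template <typename T>
using ExprMap = std::map<ExprPtr, T, StructuralLess>;
```

Inserting two separately allocated copies of `load(gep(arg0, const:4))` into an `ExprSet` would leave a single element, which is the property I would like to have for real LLVM IR values.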