Adding a new type

I was reading: https://llvm.org/docs/ExtendingLLVM.html

I am heeding the warnings there about adding new (derived) types.

I’m trying to use LLVM to model chemicals. More specifically, there are several reactive groups (salts, bases, acids, etc.) that adequately represent their respective values. For obvious reasons, I want to represent those groups as types in LLVM so that I can type check a given input and use LLVM for all my compilation needs.

I’m uncertain what I should do with respect to LLVM and the guidance the documentation provides for adding new types.

Thanks!

You almost certainly don't want a new type in that sense. Try https://llvm.org/docs/LangRef.html#structure-type instead.
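As a rough illustration of that suggestion, here is a minimal C sketch of how reactive groups might be laid out as plain structs, which clang lowers to LLVM structure types. The struct names and fields (Acid, Base, pKa, pKb, molarity) and the helper acid_moles are invented for this example and are not from the original question.

```c
/* Hypothetical layout: each reactive group is a plain aggregate.
 * The field names are assumptions made for illustration only. */
struct Acid {
    double pKa;
    double molarity;   /* mol per litre */
};

struct Base {
    double pKb;
    double molarity;   /* mol per litre */
};

/* Amount of substance in a given volume: molarity times litres. */
double acid_moles(const struct Acid *a, double litres) {
    return a->molarity * litres;
}
```

In this sketch no new LLVM type is needed: each group is just an aggregate of existing primitive types.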

-Eli

Ok, so I can use the structure type to represent bases, acids, or salts. My next question is how union typing works in LLVM. After reading a bit about it, it’s unclear to me whether it is a “thing” anymore; my intuition, after poking around in the sources, is that it’s a “no” and that it’s been deprecated.

Which then means that typing in LLVM doesn’t much matter. I can pass that up to a Python script and handle typing up there.

I just want to make sure this is accurate.

Ok, so I can use the structure type to represent bases, acids, or salts. My next question is how union typing works in LLVM. After reading a bit about it, it's unclear to me whether it is a "thing" anymore; my intuition, after poking around in the sources, is that it's a "no" and that it's been deprecated.

No, LLVM doesn't have unions; you just bitcast pointers between different struct types. (In general, if you're having trouble understanding how to lower something to LLVM IR, it's often useful to use "clang -S -emit-llvm" to see what clang emits for a simple C testcase.)
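A simple C testcase of the kind described above might look like the following sketch. The names (MaterialKind, Material, make_acid, strength) and the fields are hypothetical, chosen only to mirror the acid/base/salt example. Saving it as, say, material.c and running "clang -S -emit-llvm material.c -o material.ll" should show how clang represents the union: typically as a struct sized for its largest member. The exact IR depends on the clang version; newer releases use a single opaque pointer type, so the pointer casts may not appear explicitly.

```c
/* Hypothetical tagged union: one tag plus overlapping storage. */
enum MaterialKind { KIND_ACID, KIND_BASE, KIND_SALT };

struct Material {
    enum MaterialKind kind;          /* says which union member is valid */
    union {
        struct { double pKa; } acid;
        struct { double pKb; } base;
        struct { double solubility; } salt;
    } as;
};

/* Build an acid; only the 'acid' member of the union is initialised. */
struct Material make_acid(double pKa) {
    struct Material m;
    m.kind = KIND_ACID;
    m.as.acid.pKa = pKa;
    return m;
}

/* Read the member selected by the tag. */
double strength(const struct Material *m) {
    switch (m->kind) {
    case KIND_ACID: return m->as.acid.pKa;
    case KIND_BASE: return m->as.base.pKb;
    case KIND_SALT: return m->as.salt.solubility;
    }
    return 0.0;
}
```

The tag is what keeps the reinterpretation safe; the IR itself does nothing to stop code from reading the wrong member.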

Which then means that typing in LLVM doesn't much matter. I can pass that up to a Python script and handle typing up there.

Yes, you should only use LLVM types as far as they're actually helpful.

-Eli

Silly question … what do you expect to do driving LLVM directly that you couldn’t do by generating C and compiling that?

It’s a lot easier.

We have a natural-language DSL for a research project I’m on. We had our own custom compiler for it, but quickly outgrew it. Instead of reinventing the wheel and implementing register allocation, stack frame generation, retargetability, etc. ourselves, we wanted to use LLVM to do all the heavy lifting for us.

I actually have no problem with handling typing in a layer above LLVM. It works out nicely, as two other closely related projects I’m on need the type checker anyway. We also rely on type inference in most cases, so explicit typing is completely optional. So in LLVM I can type all materials (acids, bases, etc.) as one of LLVM’s basic types, then pass up a JSON representation of the variable interactions and run type inference on that. From what I’ve read, Z3 is much easier to use from Python than from C++ anyway.

Overall, I think this approach is much easier; I was just curious.

Thanks!