reading a module from a memory string (BitCode)

Hello,

with the latest LLVM (almost 2.0 CVS) what is the right way to read a module
from a byte array fetched from a database?

I thought that I could subclass llvm::Module to add my own fields
(typically, a MySQL id number) and then parse it as bitcode, but I am stuck,
since apparently the only way to parse bitcode is to use a BitcodeReader,
and calling materializeModule then gives a fresh llvm::Module (not my subclass).

As a general question, are LLVM classes usually supposed to be subclassed to
add application data (like my modtime and id), or not?

Does anyone have an example of reading bitcode-encoded modules? Can I
assume that such modules are not tied to a particular (LLVM target)
architecture, in other words can I store into my database modules built on
AMD64 and reload them on x86 (32 bits)?

Regards.

Hi Basile,

> Hello,
>
> with the latest LLVM (almost 2.0 CVS) what is the right way to read a module
> from a byte array fetched from a database?

Add your own blocks

> I thought that I could subclass llvm::Module to add my own fields
> (typically, a MySQL id number) and then parse it as bitcode, but I am stuck,
> since apparently the only way to parse bitcode is to use a BitcodeReader,
> and calling materializeModule then gives a fresh llvm::Module (not my subclass).

LLVM only knows how to read LLVM stuff. If you want to "add" stuff, you
need to do it in a separate block. The bitstream format is flexible
enough to support reading a new file format that is LLVM+your_stuff.
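To make the "LLVM + your_stuff" idea concrete, here is a minimal sketch of the simplest variant: a small envelope around the untouched bitcode bytes. This is not the bitstream block mechanism itself, just a plain header in front of the payload. The `StoredModule` struct, the `HMOD` tag, and the field layout are all made up for illustration; nothing here is an LLVM API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical application record: metadata plus the untouched bitcode bytes.
struct StoredModule {
    std::uint64_t id = 0;               // e.g. a MySQL row id
    std::uint64_t modtime = 0;          // e.g. seconds since the epoch
    std::vector<std::uint8_t> bitcode;  // opaque payload, never interpreted here
};

namespace {
const char kMagic[4] = {'H', 'M', 'O', 'D'};  // made-up tag for this sketch
const std::size_t kHeaderSize = 4 + 3 * 8;     // tag + id + modtime + length

// Append a 64-bit value least-significant byte first (host independent).
void putU64(std::vector<std::uint8_t>& out, std::uint64_t v) {
    for (int i = 0; i < 8; ++i)
        out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Read back a 64-bit value written by putU64.
std::uint64_t getU64(const std::uint8_t* p) {
    assert(p != nullptr);
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v |= static_cast<std::uint64_t>(p[i]) << (8 * i);
    return v;
}
}  // namespace

// Serialize the header followed by the bitcode bytes.
std::vector<std::uint8_t> packEnvelope(const StoredModule& m) {
    std::vector<std::uint8_t> out(kMagic, kMagic + 4);
    putU64(out, m.id);
    putU64(out, m.modtime);
    putU64(out, m.bitcode.size());
    out.insert(out.end(), m.bitcode.begin(), m.bitcode.end());
    return out;
}

// Parse an envelope; returns false (leaving `result` untouched) on bad input.
bool unpackEnvelope(const std::vector<std::uint8_t>& in, StoredModule& result) {
    if (in.size() < kHeaderSize) return false;
    if (!std::equal(kMagic, kMagic + 4, in.begin())) return false;
    const std::uint64_t len = getU64(&in[20]);
    if (len != in.size() - kHeaderSize) return false;
    StoredModule m;
    m.id = getU64(&in[4]);
    m.modtime = getU64(&in[12]);
    m.bitcode.assign(in.begin() + kHeaderSize, in.end());
    result = m;
    return true;
}
```

After unpacking, only the `bitcode` field would be handed to the LLVM reader, so LLVM never sees the extra fields.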

> As a general question, are LLVM classes usually supposed to be subclassed to
> add application data (like my modtime and id), or not?

No, not generally.

What is it that you're trying to do? Sounds like maybe you're trying to
do "front end" type things. The HLVM front end libraries will assist
with many of these kinds of things (associating source line, language
identifier, mangled identifier, etc. with corresponding LLVM nodes).
The general approach is to keep references back to the LLVM objects from
another block. None of this exists yet in HLVM, but it will this summer.

> Does anyone have an example of reading bitcode-encoded modules?

Bitcode is only a few weeks old, so I doubt it :)

> Can I assume that such modules are not tied to a particular (LLVM target)
> architecture, in other words can I store into my database modules built on
> AMD64 and reload them on x86 (32 bits)?

Depends on whether you mean "bitcode file format" or "LLVM module". A
module can definitely be tied to a specific target. It completely
depends on what the front end language compiler generates. From a
bitcode file format perspective, the files should be portable as they
are treated as a stream of bits without concern for endianness, etc.
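As an illustration of what "without concern for endianness" means in practice, here is a small sketch of decoding stored bytes in a way that gives the same answer on any host. It assumes what the current LLVM bitcode documentation describes (a raw bitcode stream starts with the bytes 'B', 'C', 0xC0, 0xDE and is read as little-endian 32-bit words); treat that as an assumption for the 2.0 CVS tree. The helper names are made up and are not LLVM functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decode a 32-bit word stored least-significant byte first. Only shifts on
// individual bytes are used, so the result is the same on little-endian and
// big-endian hosts.
std::uint32_t readWordLE(const unsigned char* p) {
    assert(p != nullptr);
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Cheap sanity check on a blob fetched from storage: does it start with the
// magic bytes assumed above? This says nothing about whether the module
// inside is target independent.
bool looksLikeBitcode(const unsigned char* data, std::size_t size) {
    return size >= 4 && data[0] == 'B' && data[1] == 'C' &&
           data[2] == 0xC0 && data[3] == 0xDE;
}
```

This only covers the file-format half of the question; whether the module's contents make sense on another target is a separate matter.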

Your example will work from a file format perspective (you'll get the
same bits out as you got in and they'll have the same interpretation on
the x86 as on the AMD64). But, that doesn't mean it will execute
correctly. If the module has no inline assembly or any other target
specific features (like differences in alignment) then it could work.
Your front end will determine how well this works. If the front end is
llvm-gcc then the answer is "won't work, in general, but might work on
specific test cases".

Reid.

> with the latest LLVM (almost 2.0 CVS) what is the right way to read a module
> from a byte array fetched from a database?

The bitcode reader will read from any MemoryBuffer object. There are a variety of static methods on MemoryBuffer to create them from files, stdio, and memory. If your buffer is in memory, just create a MemoryBuffer and pass in the range of bytes already in memory. Once the appropriate MemoryBuffer is created, you load it like any other bitcode stream.

> I thought that I could subclass llvm::Module to add my own fields
> (typically, a MySQL id number) and then parse it as bitcode, but I am stuck,
> since apparently the only way to parse bitcode is to use a BitcodeReader,
> and calling materializeModule then gives a fresh llvm::Module (not my subclass).

Right, don't do that :). There are three easy ways to do this sort of thing: 1) add intrinsics to capture the information you want, 2) encode the information as global variable initializers (as we do with debug info), or 3) store it out-of-band, as Reid suggests.
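For the out-of-band option, one possible shape is a side table that maps each module object to the application's own record, instead of a subclass. The sketch below is only an illustration: `ModuleInfo`, `ModuleRegistry`, and `FakeModule` are invented names, and `FakeModule` merely stands in for llvm::Module so the example compiles without LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical application data that would otherwise tempt a subclass.
struct ModuleInfo {
    std::uint64_t dbId;     // e.g. MySQL id number
    std::uint64_t modtime;  // e.g. seconds since the epoch
};

// Stand-in for llvm::Module in this sketch.
struct FakeModule {};

// Side table keyed by the address of the module it describes. The module
// type is a template parameter, so the table never needs to know its layout.
template <typename ModuleT>
class ModuleRegistry {
public:
    // Associate (or overwrite) the record for a module.
    void attach(const ModuleT* m, ModuleInfo info) {
        assert(m != nullptr);
        table_[m] = info;
    }

    // Returns the record, or nullptr if the module was never attached.
    const ModuleInfo* lookup(const ModuleT* m) const {
        typename std::map<const ModuleT*, ModuleInfo>::const_iterator it =
            table_.find(m);
        return it == table_.end() ? nullptr : &it->second;
    }

    // Must be called before the module is destroyed, since the key is its
    // address and would otherwise dangle.
    void detach(const ModuleT* m) { table_.erase(m); }

private:
    std::map<const ModuleT*, ModuleInfo> table_;
};
```

The trade-off is that the application owns the lifetime bookkeeping: an entry has to be detached when its module goes away.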

> As a general question, are LLVM classes usually supposed to be subclassed to
> add application data (like my modtime and id), or not?

No.

> Does anyone have an example of reading bitcode-encoded modules? Can I
> assume that such modules are not tied to a particular (LLVM target)
> architecture, in other words can I store into my database modules built on
> AMD64 and reload them on x86 (32 bits)?

LLVM IR is, in general, not portable if it comes from a non-portable source language like C.

-Chris

On Sat, May 12, 2007 at 04:42:49PM -0700, Chris Lattner wrote:

> with the latest LLVM (almost 2.0 CVS) what is the right way to read a module
> from a byte array fetched from a database?

> The bitcode reader will read from any MemoryBuffer object. There are a
> variety of static methods on MemoryBuffer to create them from files,
> stdio, and memory. If your buffer is in memory, just create a
> MemoryBuffer and pass in the range of bytes already in memory. Once the
> appropriate MemoryBuffer is created, you load it like any other bitcode
> stream.

Apparently BitcodeReader.h is only in lib/Bitcode/Reader/ but not in
include, so a make install does not install it.

Is it supposed to be accessible from applications? How exactly? I feel that
some install rule is missing; after a sudo make install,
  grep -rn BitcodeReader /usr/local/include/llvm/
doesn't find any occurrence! Is this a bug or a misunderstanding of mine?

Can I assume the bytecode interface is obsolete and to be replaced by bitcode?

BTW, my hobby project is a hypertextual programming language experiment
with only a Web interface (no source files). So I want to store every
produced llvm::Module in a MySQL database.

Regards

> Apparently BitcodeReader.h is only in lib/Bitcode/Reader/ but not in
> include, so a make install does not install it.

I'm not sure what you mean... the header is in include/llvm/Bitcode.

> Is it supposed to be accessible from applications? How exactly? I feel that
> some install rule is missing; after a sudo make install,
>   grep -rn BitcodeReader /usr/local/include/llvm/
> doesn't find any occurrence! Is this a bug or a misunderstanding of mine?

I think there is something strange going on :) Try doing a full update from CVS.

> Can I assume the bytecode interface is obsolete and to be replaced by bitcode?

Bytecode is completely gone now.

> BTW, my hobby project is a hypertextual programming language experiment
> with only a Web interface (no source files). So I want to store every
> produced llvm::Module in a MySQL database.

That shouldn't be a problem!

-Chris