I've been using SCons as the build tool for my frontend application, and I'm getting to the point where it would be useful to create a custom scanner for my generated bitcode files so that SCons can do proper dependency analysis. At the moment, SCons has no way to know which source files a particular bitcode file depends on, so the only way to do a "correct" build is to rebuild everything. Mostly I've been getting by with "incorrect" builds, meaning that I only rebuild the files that actually changed. However, this has tripped me up once or twice and I'd like to solve it.
SCons lets you write custom scanners in Python, but I think it will be easier in the long run to do this work in C++. So the idea is to write a command-line tool that prints the list of dependencies for a bitcode file, and then write a Python wrapper that calls it from SCons.
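The Python side of that plan could look roughly like the sketch below. The tool name "bcdeps" and its output format (one source path per line on stdout) are assumptions made for illustration, not an existing program.

```python
import subprocess


def list_dependencies(bitcode_path, tool=("bcdeps",)):
    """Run the dependency tool on one bitcode file and return the paths it prints.

    `tool` is the command prefix. "bcdeps" is a hypothetical name for the
    C++ helper, assumed to print one source path per line on stdout.
    """
    result = subprocess.run(
        [*tool, str(bitcode_path)],
        capture_output=True,
        text=True,
        check=True,  # a failing tool should fail the build, not hide dependencies
    )
    # Drop blank lines and surrounding whitespace.
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def bitcode_scan(node, env, path):
    """Scanner function in the (node, env, path) shape that SCons calls."""
    return env.File(list_dependencies(str(node)))


# In an SConstruct this would be registered along these lines:
#   scanner = Scanner(function=bitcode_scan, skeys=[".bc"])
#   env.Append(SCANNERS=scanner)
```

Keeping the subprocess call in its own function means it can be exercised without SCons, and the scanner function stays a thin adapter.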
My code generator keeps a list of the source files that were imported during compilation, and at the moment it emits that list as an internal-linkage array of strings, one string per import. The command-line tool can then load the bitcode file and read the string array. Since the strings have internal linkage and nothing else in the bitcode file refers to them, they ought to be dropped during optimization (I hope).
However, I notice that getting at the strings from within the command-line tool is a little complicated, since I have to decompose the various constant getelementptr expressions to reach the actual string data. I'm wondering if there's a better way to represent this information. Maybe using the debug API, or perhaps annotation intrinsics?
For something equivalent to #include (where the dependencies are not fully specified on the command line), have you considered a solution like 'gcc -c -MD -MF foo.d foo.c'? This writes the dependencies into a file (foo.d) as a side effect of compilation. The dependencies can be directly included into 'make' on subsequent runs. If the output is not present, it will necessarily be rebuilt. If the output is present, then sufficient dependencies will be listed in the file. (They may not be "up to date", but it doesn't matter if you think it through: the dependency set can only change when one of the already-listed files is edited, and that edit triggers a rebuild which regenerates the list.) Granted that you're not using make, but the principle is perfectly sound.
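Since the build here is driven from Python rather than make, a dependency file in that format could be read back with something like the sketch below. It assumes the simple "target: prerequisites" layout with backslash continuations and ignores paths containing spaces or colons. SCons also has a ParseDepends function for this kind of file, which may make a hand-written parser unnecessary.

```python
def parse_depfile(text):
    """Parse make-style dependency rules into {target: [prerequisites]}.

    Handles backslash-newline continuations. Does not handle paths that
    contain spaces or colons (for example Windows drive letters).
    """
    rules = {}
    # Join continued lines before splitting into rules.
    joined = text.replace("\\\n", " ")
    for line in joined.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        target, sep, prereqs = line.partition(":")
        if not sep:
            continue  # not a rule line
        rules.setdefault(target.strip(), []).extend(prereqs.split())
    return rules
```

For example, parsing the text "foo.o: foo.c foo.h" gives a dictionary mapping "foo.o" to the list ["foo.c", "foo.h"].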
For linkage command lines, lazily updating 'response files' (sorry, Windows terminology...) can provide a complete solution. Something like:
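LINK_CMD := $(LINKER) $(LINK_INPUTS)
# Rewrite the response file only when the command line changes, so its timestamp is otherwise left alone.
$(shell if [ "`cat $(INT_DIR)/link_cmd 2>/dev/null`" != "$(LINK_CMD)" ]; then echo "$(LINK_CMD)" > $(INT_DIR)/link_cmd; fi)
# The output depends on the response file as well as on the link inputs.
output.so: $(INT_DIR)/link_cmd $(LINK_INPUTS)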
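The same lazy update is easy to express on the Python side of an SCons build. This is only a sketch; the function name write_if_changed is made up for illustration.

```python
from pathlib import Path


def write_if_changed(path, content):
    """Write `content` to `path` only if it differs; return True if written."""
    path = Path(path)
    try:
        if path.read_text() == content:
            return False  # unchanged: leave the file and its timestamp alone
    except FileNotFoundError:
        pass  # first run: fall through and create the file
    path.write_text(content)
    return True
```

Calling it with the full link command line as the content means the file is touched only when the command actually changes, which is what makes it safe to list as a dependency of the output.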
Of course, these are complementary, and you can use either or both depending on your needs. Using a response file for compilation could protect against changes to the include path which are not captured by the direct file dependencies. Using a depends file for a linker component could protect against files being added to the linker search path.