VC++ linking issues, revisited

I've gone about as far as I can in building executables with VC++. The problem with the remaining ones is that they rely on the static constructor trick to register various modules. This doesn't work with VC++ because without an explicit external reference to these modules they simply can't be linked in to an executable.
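
For anyone who hasn't seen the static constructor trick, here is a minimal single-file sketch. All of the names (Pass, RegisterPass, registry, createPassByName, the "dce" pass) are made up for illustration and are not LLVM's actual classes. In a single file the registration always runs; the problem described above only appears when the registering object lives in a .lib member that nothing else references, so the linker never pulls it in.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical miniature of a pass registry; none of these names are LLVM's.
struct Pass {
  virtual ~Pass() = default;
  virtual const char *name() const = 0;
};

using PassFactory = Pass *(*)();

// A function-local static avoids depending on initialization order
// between different source files.
std::map<std::string, PassFactory> &registry() {
  static std::map<std::string, PassFactory> table;
  return table;
}

// Constructing one of these at namespace scope registers a pass as a
// side effect of static initialization, before main() runs.
struct RegisterPass {
  RegisterPass(const char *name, PassFactory factory) {
    assert(factory != nullptr && "a pass needs a factory");
    registry()[name] = factory;
  }
};

// --- what would normally live in its own .cpp / .obj file ---
namespace {
struct DeadCodeElim : Pass {
  const char *name() const override { return "dce"; }
};
Pass *createDeadCodeElim() { return new DeadCodeElim; }
// Nothing outside this file ever names X; that is the whole trick,
// and also why the linker sees no reason to pull this object in.
RegisterPass X("dce", createDeadCodeElim);
} // namespace

// A tool that creates passes by name only touches the registry.
Pass *createPassByName(const std::string &name) {
  auto it = registry().find(name);
  return it == registry().end() ? nullptr : it->second();
}
```

Calling createPassByName("dce") returns a new pass here, and an unknown name returns a null pointer. If the linker drops the object file holding X, the lookup for "dce" would come back null as well.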

This isn't a new problem, of course. Morten originally ran into this getting the X86 backend to link in, and solved it by introducing a global variable that could be used as the external reference. The problem is, this doesn't scale. There are few code generator targets, and fewer still that one would care to use on Windows. But there are dozens of optimizations and analyses. It's not practical or maintainable to give each one a global variable and then reference it from each affected executable.

So I can build "opt" (and actually have), but it's just a big waste of bytes, as it has no optimizations available to it. And if I understand things correctly, it means that the JIT can't do any optimizations either.

I'm not really sure how to deal with this. The best solution I can come up with is to put all of these modules into DLLs. When a DLL is loaded, all of its static constructors are executed, regardless of which modules are externally referenced. Nonetheless, there must be at least *one* external reference, or else the DLL wouldn't be loaded automatically in the first place. The DLL could be manually loaded, but that would be introducing Windows-specific code in places you probably don't want it. However, one global (or dummy function) for all optimizations or all code generator targets or all analyses is much better than one for each optimization or target or analysis.

I think this will work, but it does represent a major change in how the VC++ build is conducted and I want to get feedback first, especially from Morten.

Jeff,

There should be a way to do what we do with the Unix Makefiles and build
re-linked object modules. That is, when we build an analysis or
transform pass, we create two things: a .o file and a .a file. They
contain the same code but the latter is searchable while the former is
not.

Can you not "pre-link" a bunch of .obj files together with VC++ to
produce a new .obj file? And, when linking something like opt, will it
not just put all .obj files that you specify into the executable? I
think this is the best approach as it avoids some slowness in start up
of the tool if the equivalent DLL approach was taken.

Reid.

No, VC++ has no way to combine multiple .obj files into one. Nor is there any way to force the entire contents of a .lib file into an executable. Believe me, I looked for a way. Morten couldn't find one either. Even Microsoft's command line tools can't do it. Advantage: GNU.

DLLs aren't that slow any more. Windows is so dependent on DLLs (the Win32 API itself is implemented as DLLs) that a lot of effort has gone into optimizing them. You wouldn't believe the number of DLLs that are present in a process for a program that doesn't have any DLLs of its own. I even found the VC++ 6.0 runtime DLL in LLVM processes, and I haven't a clue how *that* got loaded (must be used by some system-supplied DLL).

OK, there may be some light at the end of the tunnel. I *can* force an arbitrary .obj file to become part of the executable, one that is not part of the executable's project. This is sufficient to eliminate the global variable hack Morten introduced for the X86 target.

But this still doesn't scale very well, as I'd have to manually enumerate all .objs that are transforms and insert this list into every project that builds an executable that needs them.

But maybe if I got *very* clever... maybe too clever: I add a post-build event to project Transforms that lists the .obj files that now exist for the project and turn that into a response file that's supplied to the exe link steps (if I can actually supply a response file in VS...). And do it without requiring Perl or Python or whatever that may not be available. Well, it's worth a try...

Jeff Cohen wrote:

No, VC++ has no way to combine multiple .obj files into one. Nor is there any way to force the entire contents of a .lib file into an executable. Believe me, I looked for a way. Morten couldn't find one either. Even Microsoft's command line tools can't do it. Advantage: GNU.

DLLs aren't that slow any more. Windows is so dependent on DLLs (the Win32 API itself is implemented as DLLs) that a lot of effort has gone into optimizing them. You wouldn't believe the number of DLLs that are present in a process for a program that doesn't have any DLLs of its own. I even found the VC++ 6.0 runtime DLL in LLVM processes, and I haven't a clue how *that* got loaded (must be used by some system-supplied DLL).

Why not just build the optimization libraries as DLLs then?

-Chris

_______________________________________________
LLVM Developers mailing list
LLVMdev@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://mail.cs.uiuc.edu/mailman/listinfo/llvmdev

Jeff Cohen wrote:

OK, there may be some light at the end of the tunnel. I *can* force an arbitrary .obj file to become part of the executable, one that is not part of the executable's project. This is sufficient to eliminate the global variable hack Morten introduced for the X86 target.

But this still doesn't scale very well, as I'd have to manually enumerate all .objs that are transforms and insert this list into every project that builds an executable that needs them.

But maybe if I got *very* clever... maybe too clever: I add a post-build event to project Transforms that lists the .obj files that now exist for the project and turn that into a response file that's supplied to the exe link steps (if I can actually supply a response file in VS...). And do it without requiring Perl or Python or whatever that may not be available. Well, it's worth a try...

The light was an oncoming train. Response files cannot be used within Visual Studio for a stupid technical reason that'd take some Microsoftie 30 minutes at most to fix if Microsoft bothered. Manually forcing each of the 73 object files in Transforms to link not only doesn't scale and would be a pain to maintain, it doesn't even work. The list is too long for VS to handle: it gets truncated, without warning and in the middle of a file name, while VS creates the linker response file.

So there's only two ways of dealing with this. The first is to use DLLs. To prevent code from being duplicated in multiple DLLs and the EXE, the bulk of the code in lib/ will have to go in DLLs. To keep the number of DLLs reasonable, many projects will have to be collapsed into one. There will be one DLL (and project) each for the transforms, the analyses, and the targets, and one catchall DLL for all common dependencies of the other three.

The other way is to stick a unique global variable in each transform and analysis, create one header file that uses the globals in all the transforms, and likewise for the analyses, and include that header file from any executable that needs its services. There are few enough targets that using forced dependencies works, though this approach can be used here as well for consistency.
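
A rough single-file sketch of that second approach, assuming one int "anchor" global per transform. The names LinkAnchor_DeadCodeElim, LinkAnchor_ConstantProp and referenceAllTransforms are hypothetical, and in a real tree the three sections would be separate files.

```cpp
// --- in each transform's own .cpp file (hypothetical names) ---
int LinkAnchor_DeadCodeElim = 1;
int LinkAnchor_ConstantProp = 1;

// --- in one shared header for all transforms (hypothetical) ---
extern int LinkAnchor_DeadCodeElim;
extern int LinkAnchor_ConstantProp;

// Reading every anchor creates an external reference to each
// transform's object file from whatever code calls this function.
inline int referenceAllTransforms() {
  return LinkAnchor_DeadCodeElim + LinkAnchor_ConstantProp;
}
```

An executable would include the header and call referenceAllTransforms() from somewhere the result is actually used. A call whose result is discarded may be optimized away along with the references.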

It's a safe bet the LLVM community really dislikes the second approach. But there's no guarantee the first approach can actually be made to work. The only other option is to use makefiles to do builds and use VS only for debugging, something which would be really disliked by the Windows software development community.

Hello,

I've been away working on other things so I have not been able to respond to mails on this mailing list for a while. But today I got back and I got the latest version of LLVM from CVS and built it successfully (after I deleted my old config.h which was included instead of the one in the new location)...

Jeff Cohen wrote:

OK, there may be some light at the end of the tunnel. I *can* force an arbitrary .obj file to become part of the executable, one that is not part of the executable's project. This is sufficient to eliminate the global variable hack Morten introduced for the X86 target.

However I can no longer link it with our main project since the X86TargetMachine symbol is gone, and the intermediate object files are not available to the other project. I can't easily do the same as the fibonacci example does, so I have to put it back to the way it was.

But this still doesn't scale very well, as I'd have to manually enumerate all .objs that are transforms and insert this list into every project that builds an executable that needs them.

I only have the problem with the X86TargetMachine because in the case of the optimization passes I explicitly call the createXXXPass functions. 'opt' creates passes by name instead and that's why it gets into trouble.

So there's only two ways of dealing with this. The first is to use DLLs. To prevent code from being duplicated in multiple DLLs and the EXE, the bulk of the code in lib/ will have to go in DLLs. To keep the number of DLLs reasonable, many projects will have to be collapsed into one. There will be one DLL (and project) each for the transforms, the analyses, and the targets, and one catchall DLL for all common dependencies of the other three.

I looked at this, but the problem is that DLLs need explicit __declspec(dllimport) and __declspec(dllexport) on all symbols that are to be exported and imported. I think adding the declarators to all the symbols that are exported/imported is a lot of work and also pollutes the source for the Unix versions.
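
To show what that pollution tends to look like, here is a sketch of the usual export-macro pattern. The macro names MYLIB_API, MYLIB_SHARED and MYLIB_BUILDING_DLL, and the Pass and createDeadCodeElimPass declarations, are hypothetical and only illustrate the shape.

```cpp
// Hypothetical export macro: expands to a __declspec only when
// building or consuming a DLL on Windows, and to nothing elsewhere.
#if defined(_WIN32) && defined(MYLIB_SHARED)
#  ifdef MYLIB_BUILDING_DLL
#    define MYLIB_API __declspec(dllexport)
#  else
#    define MYLIB_API __declspec(dllimport)
#  endif
#else
#  define MYLIB_API
#endif

// Every class and function that crosses the DLL boundary has to
// carry the macro, on every platform's copy of the source.
class MYLIB_API Pass {
public:
  virtual ~Pass() = default;
  virtual const char *name() const = 0;
};

MYLIB_API Pass *createDeadCodeElimPass();

// Implementation detail; stays inside the library, so no macro.
namespace {
class DeadCodeElim : public Pass {
public:
  const char *name() const override { return "dce"; }
};
} // namespace

Pass *createDeadCodeElimPass() { return new DeadCodeElim; }
```

On Unix, or in a static Windows build, the macro is empty and the code compiles as usual, but the annotation still has to appear on every exported declaration.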

The other way is to stick a unique global variable in each transform and analysis, create one header file that uses the globals in all the transforms, and likewise for the analyses, and include that header file from any executable that needs its services. There are few enough targets that using forced dependencies works, though this approach can be used here as well for consistency.

I think this solution is quite OK considering almost all the modules already have such global symbols, namely the createXXXPass functions.

m.

Morten Ofstad wrote:

Jeff Cohen wrote:

OK, there may be some light at the end of the tunnel. I *can* force an arbitrary .obj file to become part of the executable, one that is not part of the executable's project. This is sufficient to eliminate the global variable hack Morten introduced for the X86 target.

However I can no longer link it with our main project since the X86TargetMachine symbol is gone, and the intermediate object files are not available to the other project. I can't easily do the same as the fibonacci example does, so I have to put it back to the way it was.

Ouch... sorry about that.

But this still doesn't scale very well, as I'd have to manually enumerate all .objs that are transforms and insert this list into every project that builds an executable that needs them.

I only have the problem with the X86TargetMachine because in the case of the optimization passes I explicitly call the createXXXPass functions. 'opt' creates passes by name instead and that's why it gets into trouble.

So there's only two ways of dealing with this. The first is to use DLLs. To prevent code from being duplicated in multiple DLLs and the EXE, the bulk of the code in lib/ will have to go in DLLs. To keep the number of DLLs reasonable, many projects will have to be collapsed into one. There will be one DLL (and project) each for the transforms, the analyses, and the targets, and one catchall DLL for all common dependencies of the other three.

I looked at this, but the problem is that DLLs need explicit __declspec(dllimport) and __declspec(dllexport) on all symbols that are to be exported and imported. I think adding the declarators to all the symbols that are exported/imported is a lot of work and also pollutes the source for the Unix versions.

Ouch again... you're right of course. The DLL approach won't work.

The other way is to stick a unique global variable in each transform and analysis, create one header file that uses the globals in all the transforms, and likewise for the analyses, and include that header file from any executable that needs its services. There are few enough targets that using forced dependencies works, though this approach can be used here as well for consistency.

I think this solution is quite OK considering almost all the modules already have such global symbols, namely the createXXXPass functions.

That's a good point. The globals already exist (in the form of functions). The header files simply need dummy code to "use" them, and in such a way that the VC++ optimizer doesn't think it's dead code. A static constructor can do the job. Still need to include this header in every file with a main() function.
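
Something along these lines might do it. This is only a sketch: the pass classes, the PassesCreated counter, the ForcePassLinking struct and the environment variable name are all invented for illustration. The idea is to guard the calls with a condition that is never true at run time but that the compiler cannot evaluate at compile time.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-ins for the existing createXXXPass functions.
struct Pass { virtual ~Pass() = default; };
struct DeadCodeElim : Pass {};
struct ConstantProp : Pass {};

int PassesCreated = 0; // only here to make the behaviour observable

Pass *createDeadCodeElimPass() { ++PassesCreated; return new DeadCodeElim; }
Pass *createConstantPropPass() { ++PassesCreated; return new ConstantProp; }

// --- hypothetical header included by every file with a main() ---
namespace {
struct ForcePassLinking {
  ForcePassLinking() {
    // In practice getenv never returns (char *)-1, so this always
    // returns early. The compiler cannot prove that, so it has to
    // keep the calls below and the external references they create.
    char *never = reinterpret_cast<char *>(static_cast<std::intptr_t>(-1));
    if (std::getenv("FORCE_PASS_LINKING_DUMMY") != never)
      return;
    (void)createDeadCodeElimPass();
    (void)createConstantPropPass();
  }
} ForcePassLinkingInstance;
} // namespace
```

Nothing is actually created at startup (PassesCreated stays at 0), but each createXXXPass function is referenced from the executable, which should be enough to drag its object file out of the library.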

Hello,

Welcome back,

I think this solution is quite OK considering almost all the modules already have such global symbols, namely the createXXXPass functions.

If you wanted to add a createXXX symbol for each pass that does not already have one, feel free to do so.

-Chris