PATH and LD_LIBRARY_PATH

All,

With the pending reorganization of the software, I have some questions
about how developers set their PATH and LD_LIBRARY_PATH variables when
working with LLVM. This is a bit long winded, but bear with me.

We're planning to break the "llvm" module up into three modules:

      * support - lib/Support, lib/System, autoconf, make support,
        utilities
      * core - VMCore, Asm, Bitcode and the essential IR tools (llvm-as,
        etc.)
      * opt (not sure that's the final name) - everything else:
        Analysis, Transforms, CodeGen, Target, etc

Additionally, there are new modules such as "hlvm", "cfe",
"llvm-gcc-4.2" and undoubtedly more to come in the future.

We haven't decided the final architecture so don't quibble about what
goes in what module (yet). The point is, there will be several modules
instead of everything being in "llvm". With this situation we can no
longer just put llvm/Debug/bin in PATH and llvm/Debug/lib in
LD_LIBRARY_PATH and just have things work. Build products would be
placed in the Debug/bin and Debug/lib directories for each module.

We'd like to better support multiple LLVM environments on a given
machine. An environment being "a directory and the associated
environment variable settings to support software development". I
personally have worked this way for about a year now. It helps keep
various works in progress separated. Obviously, having multiple
environments requires a certain amount of bookkeeping to keep things
straight. I use shell functions like "llvmco" and "llvmenv" to checkout
and switch to particular environments. The functions take care of the
details.

However, even with only a single checkout (environment) of llvm
software, there are details to be taken care of. We would like to
support this better, but the question is how.

Here are some of the issues:
      * On some platforms you set SHLIB_PATH or SHOBJ_PATH, etc.
      * With more modules the PATH and LD_LIBRARY_PATH become long (one
        entry per module). Having every module's Debug/bin in PATH and
        Debug/lib in LD_LIBRARY_PATH gets hard to maintain when there's
        multiple environments. Furthermore, the paths need to change
        when you switch to a release or release+asserts or release
        +expensive_checks build.
      * There are inter-dependencies between modules which may affect
        the relative ordering of the PATH and LD_LIBRARY_PATH component
        paths.
      * Building things can be affected because if you put the wrong
        directory in your LD_LIBRARY_PATH you can end up linking against
        libraries built by the compiler instead of your platform's
        native compiler, which will ultimately fail (very late too).
      * Having two llvm-gcc versions (4.0 and 4.2) in separate modules
        could lead to conflicts.
      * Upstream projects like hlvm and cfe will have several
        dependencies so getting the paths straight is important for
        successful building. Additionally, users will have their own
        project directories, at the top of the food chain, which are
        dependent on everything.
      * We want to treat each module, as much as possible, as a separate
        entity (very loose coupling), but they are API locked anyway and
        we can't do much about that. The dependencies are real.
      * There are utilities that we want in the paths (like llvm/utils)
        as well as utilities like TableGen that might eventually be
        needed across projects (e.g. "core" would need TableGen for the
        intrinsic functions but the module containing the targets also
        needs it).
      * Does every module need its own "llvm-config" program?
      * Some of us have multiple things going on at the same time and so
        work with multiple LLVM environments. For example, you could be
        working on an involved bug fix, your normal development work,
        quickie fixes, a branch for some side work, etc. In each of
        these cases you want a separate checkout and the associated
        environment variable settings for that directory. I call this an
        "environment". It is basically just a way to keep various works
        in progress separated. How can multiple environments be best
        supported?

So, the question is .. what do you want to do about all this?

Here are some options to be discussed:

     1. Punt - Let each developer/user figure this out for
        themselves.
     2. Install - That is, set your PATH and LD_LIBRARY_PATH to one
        place and "make install" the build results into that directory.
     3. Shell - Provide some shell functions and aliases to manage
        setting the environment correctly. This could even use the
        ModuleInfo.txt file to glean dependencies. For example, the
        llvm-top module could have a "setenv.sh" script that is invoked
        with ". ./setenv.sh" to set the environment for whatever is
        checked out in that llvm-top. We'd need one for each type of
        shell and users would have to remember to run it.
     4. Other - got any other ideas?
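
A minimal sketch of what the "setenv.sh" in option 3 might look like, for POSIX shells only. The function name llvm_setenv and the assumption that every directory directly under llvm-top is a module with <mode>/bin and <mode>/lib subdirectories are illustrative; it does not read ModuleInfo.txt and does not order entries by module dependency.

```shell
# setenv.sh (hypothetical sketch): source it with ". ./setenv.sh".
# llvm_setenv TOPDIR [MODE] prepends every module's MODE/bin to PATH
# and MODE/lib to LD_LIBRARY_PATH. MODE defaults to Debug.
llvm_setenv() {
    _top=$1
    _mode=${2:-Debug}
    for _mod in "$_top"/*/; do
        _mod=${_mod%/}                      # drop the trailing slash
        if [ -d "$_mod/$_mode/bin" ]; then
            PATH="$_mod/$_mode/bin:$PATH"
        fi
        if [ -d "$_mod/$_mode/lib" ]; then
            # Avoid a stray ":" when LD_LIBRARY_PATH starts out empty.
            LD_LIBRARY_PATH="$_mod/$_mode/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
        fi
    done
    export PATH LD_LIBRARY_PATH
    unset _top _mode _mod
}
```

Usage would be something like ". ./setenv.sh" followed by 'llvm_setenv "$PWD" Release'. Running it twice, or for two build modes, stacks entries on top of each other, so a real script would also need to remove the entries it added earlier.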

I need help with #4 but I'm also looking for general feedback on solving the issues raised.

Reid.

One further issue I would like to raise:

    * At some point, some people (crazies like me?) are going to want
      to install LLVM as the default and primary system compiler. Some
      of us may even want to integrate this process into a
      distribution's package management. Anything that makes this saner
      is appreciated, and it might not conflict with the already
      mentioned serious concerns.

     2. Install - That is, set your PATH and LD_LIBRARY_PATH to one
        place and "make install" the build results into that directory.

I personally really like this, with some more or less involved
addition of the next one...

     3. Shell - Provide some shell functions and aliases to manage
        setting the environment correctly. This could even use the
        ModuleInfo.txt file to glean dependencies. For example, the
        llvm-top module could have a "setenv.sh" script that is invoked
        with ". ./setenv.sh" to set the environment for whatever is
        checked out in that llvm-top. We'd need one for each type of
        shell and users would have to remember to run it.
     4. Other - got any other ideas?

What about providing make install, with scripts to assist in setting
up appropriate ./configure prefixes, make installs, and environment
variables to utilize "staged" installs for each environment. This
would allow the behavior in these development "environments" to mimic
very closely the potential installed "real world" behavior of the
packages. The root _is_ an environment, it's just that we provide
scripts (or users provide scripts) which construct staged installs to
set up "environments" for on-going development. These scripts could
start very minimal, or not at all, and be built up over time without
hamstringing anyone's homegrown scripts (such as mine) which are based
entirely on the existing make install functionality.
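
As a rough illustration of the staged-install idea, the sketch below points a POSIX shell at one staging prefix. The function name use_stage and the bin, lib and lib/pkgconfig layout under the prefix are assumptions; PKG_CONFIG_PATH is the variable pkg-config reads for extra search directories.

```shell
# use_stage PREFIX: point this shell at one staged install.
# The stage itself would be populated per module with the usual
#   ./configure --prefix="$PREFIX" && make && make install
use_stage() {
    _stage=$1
    PATH="$_stage/bin:$PATH"
    LD_LIBRARY_PATH="$_stage/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    PKG_CONFIG_PATH="$_stage/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"
    export PATH LD_LIBRARY_PATH PKG_CONFIG_PATH
    unset _stage
}
```

With one stage per environment, the environment variables always name a single prefix no matter how many modules were installed into it.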

For detecting installed libraries, I actually like module-specific
*-config scripts, or leveraging pkg-config (as ugly as it is) to do
module-specific stuff without code duplication. This allows flexible
and easy ./configure dependency resolution on installed packages, be
they installed in '/', or some staging location.
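
One way the lookup could work from a configure-style script is sketched below. The helper name module_flags is made up, and it assumes a module-specific "<module>-config" script accepts --cflags and --libs the way pkg-config does, which a real llvm-config-style script may not.

```shell
# module_flags MODULE: print compile and link flags for an installed
# module, preferring a MODULE-config script and falling back to pkg-config.
module_flags() {
    _m=$1
    if command -v "$_m-config" >/dev/null 2>&1; then
        "$_m-config" --cflags --libs
    elif command -v pkg-config >/dev/null 2>&1 && pkg-config --exists "$_m"; then
        pkg-config --cflags --libs "$_m"
    else
        echo "module_flags: cannot locate module '$_m'" >&2
        return 1
    fi
}
```

Because both lookups go through PATH (and PKG_CONFIG_PATH for pkg-config), the same call works whether the module is installed in '/' or in a staging location.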

My 2 cents. =]
-Chandler

PS: I would be willing to contribute to seeing this smoothly work,
especially with the config scripts, and autoconf magic to enforce
dependencies.

We're planning to break the "llvm" module up into three modules:

What's the motivation for this? What does it provide that we don't have now?
Is there something broken that needs fixing?

     2. Install - That is, set your PATH and LD_LIBRARY_PATH to one
        place and "make install" the build results into that directory.

For me, this is by far the clear winner. It is the way nearly all free
software works. It's familiar and will take less time for people to learn.

                                       -Dave

With the pending reorganization of the software, I have some questions
about how developers set their PATH and LD_LIBRARY_PATH variables when
working with LLVM. This is a bit long winded, but bear with me.

ok :)

We're planning to break the "llvm" module up into three modules:

     * support - lib/Support, lib/System, autoconf, make support,
       utilities

which utilities? The C++ programs in llvm/utils should not be moved.

     * core - VMCore, Asm, Bitcode and the essential IR tools (llvm-as,
       etc.)

I'm still not convinced that this is useful to split out from the rest of the LLVM tree, we should discuss this again after support is split out.

     * opt (not sure that's the final name) - everything else:
       Analysis, Transforms, CodeGen, Target, etc

Additionally, there are new modules such as "hlvm", "cfe",
"llvm-gcc-4.2" and undoubtedly more to come in the future.

Yep.

We haven't decided the final architecture so don't quibble about what
goes in what module (yet). The point is, there will be several modules
instead of everything being in "llvm". With this situation we can no
longer just put llvm/Debug/bin in PATH and llvm/Debug/lib in
LD_LIBRARY_PATH and just have things work. Build products would be
placed in the Debug/bin and Debug/lib directories for each module.

Ok.

However, even with only a single checkout (environment) of llvm
software, there are details to be taken care of. We would like to
support this better, but the question is how.

Here are some of the issues:
     * On some platforms you set SHLIB_PATH or SHOBJ_PATH, etc.

This is up to the user to know what to do.

     * With more modules the PATH and LD_LIBRARY_PATH become long (one
       entry per module). Having every module's Debug/bin in PATH and
       Debug/lib in LD_LIBRARY_PATH gets hard to maintain when there's
       multiple environments.

Let's take a specific example, someone working on the clang front-end. For these people, they will check out support, llvm, and clang modules. If they *only* are playing with the front-end, and don't want to install, they just need to add the clang bin directory to their path. If they also want convenient access to the llvm tools, they can add that dir to their path. This seems reasonable to me.

Regardless of the user's PATH setting, the build process for the various modules should invoke the tools from other modules *without* PATH needing to be set.

       Furthermore, the paths need to change
       when you switch to a release or release+asserts or release
       +expensive_checks build.

We have this problem today, it isn't a significant issue AFAICT.

     * There are inter-dependencies between modules which may affect
       the relative ordering of the PATH and LD_LIBRARY_PATH component
       paths.

This is only an issue if you have a name collision, right? If so the answer is "don't do that" :)

     * Building things can be affected because if you put the wrong
       directory in your LD_LIBRARY_PATH you can end up linking against
       libraries built by the compiler instead of your platform's
       native compiler, which will ultimately fail (very late too).

This is only for llvm-gcc?

     * Having two llvm-gcc versions (4.0 and 4.2) in separate modules
       could lead to conflicts.

The only thing that depends on llvm-gcc is the llvm-test suite. Its configure script should probably try to autodetect which C front-end you have (4.0, 4.2, clang) and "build in" the paths it needs into its Makefile.config.

     * Upstream projects like hlvm and cfe will have several
       dependencies so getting the paths straight is important for
       successful building. Additionally, users will have their own
       project directories, at the top of the food chain, which are
       dependent on everything.

I don't see this. Building should just be a matter of typing the moral equivalent of "make". If clang used tblgen for its build, it would know to invoke it from the llvm module, and would use an absolute path generated by the makefile.

     * We want to treat each module, as much as possible, as a separate
       entity (very loose coupling), but they are API locked anyway and
       we can't do much about that. The dependencies are real.

Yep. The dependencies are hard dependencies, though it would also be nice to support "optional dependencies" down the line (if you check out "this" in your tree, it enables "that" feature in some other dependent module).

     * There are utilities that we want in the paths (like llvm/utils)
       as well as utilities like TableGen that might eventually be
       needed across projects (e.g. "core" would need TableGen for the
       intrinsic functions but the module containing the targets also
       needs it).

The makefiles that build the projects should not depend on PATH. The only need for PATH to be set is if the user wants to invoke something (like llvm-as, opt, etc).

     * Does every module need its own "llvm-config" program?

It would be nice if this was shared, perhaps to live in the support module?

     * Some of us have multiple things going on at the same time and so
       work with multiple LLVM environments. For example, you could be
       working on an involved bug fix, your normal development work,
       quickie fixes, a branch for some side work, etc. In each of
       these cases you want a separate checkout and the associated
       environment variable settings for that directory. I call this an
       "environment". It is basically just a way to keep various works
       in progress separated. How can multiple environments be best
       supported?

Wow, those people need to learn to work more incrementally ;-). j/k

It seems that they should just check out multiple trees and have scripts or something to set their PATH as appropriate... just like today.

I use this sort of thing when I have a frozen version of a tree for some project, and we need to backport a patch to that tree. In this case, I don't mess with my path at all, I just manually invoke utilities from that tree with absolute paths.

So, the question is .. what do you want to do about all this?

Here are some options to be discussed:

    1. Punt - Let each developer/user figure this out for
       themselves.

This is the de facto answer until we get a solution :)

    2. Install - That is, set your PATH and LD_LIBRARY_PATH to one
       place and "make install" the build results into that directory.

We *need* to support make install, but we also should not make it required. End users just want to 'check out/download + build + install', they don't want to mess with their environment or anything else for that matter.

    3. Shell - Provide some shell functions and aliases to manage
       setting the environment correctly. This could even use the
       ModuleInfo.txt file to glean dependencies. For example, the
       llvm-top module could have a "setenv.sh" script that is invoked
       with ". ./setenv.sh" to set the environment for whatever is
       checked out in that llvm-top. We'd need one for each type of
       shell and users would have to remember to run it.

Ick.

I need help with #4 but I'm also looking for general feedback on solving the issues raised.

To be clear, we're talking about LLVM developers here, not end users (who just use make install). I think LLVM devs can know to add a directory or two if they want convenient access to some llvm tool that gets built. Worst case they can use absolute paths if they want.

-Chris

> With the pending reorganization of the software, I have some questions
> about how developers set their PATH and LD_LIBRARY_PATH variables when
> working with LLVM. This is a bit long winded, but bear with me.

ok :)

> We're planning to break the "llvm" module up into three modules:
>
> * support - lib/Support, lib/System, autoconf, make support,
> utilities

which utilities? The C++ programs in llvm/utils should not be moved.

No, I was thinking more like "mkpatch" and "llvmgrep".

> * core - VMCore, Asm, Bitcode and the essential IR tools (llvm-as,
> etc.)

I'm still not convinced that this is useful to split out from the rest of
the LLVM tree, we should discuss this again after support is split out.

Surely. About the only reason is for modules that only work with the IR
and don't want to have to compile all of the transforms and targets just
to use the IR. I know there's no technical difference, it's merely a
developer convenience. In any event, we can discuss later. My point was
there will be more modules.

> * opt (not sure that's the final name) - everything else:
> Analysis, Transforms, CodeGen, Target, etc
>
> Additionally, there are new modules such as "hlvm", "cfe",
> "llvm-gcc-4.2" and undoubtedly more to come in the future.

Yep.

> We haven't decided the final architecture so don't quibble about what
> goes in what module (yet). The point is, there will be several modules
> instead of everything being in "llvm". With this situation we can no
> longer just put llvm/Debug/bin in PATH and llvm/Debug/lib in
> LD_LIBRARY_PATH and just have things work. Build products would be
> placed in the Debug/bin and Debug/lib directories for each module.

Ok.

> However, even with only a single checkout (environment) of llvm
> software, there are details to be taken care of. We would like to
> support this better, but the question is how.
>
> Here are some of the issues:
> * On some platforms you set SHLIB_PATH or SHOBJ_PATH, etc.

This is up to the user to know what to do.

> * With more modules the PATH and LD_LIBRARY_PATH become long (one
> entry per module). Having every module's Debug/bin in PATH and
> Debug/lib in LD_LIBRARY_PATH gets hard to maintain when there's
> multiple environments.

Let's take a specific example, someone working on the clang front-end. For
these people, they will check out support, llvm, and clang modules. If
they *only* are playing with the front-end, and don't want to install,
they just need to add the clang bin directory to their path. If they also
want convenient access to the llvm tools, they can add that dir to their
path. This seems reasonable to me.

Regardless of the user's PATH setting, the build process for the various
modules should invoke the tools from other modules *without* PATH
needing to be set.

Yeah, it's more LD_LIBRARY_PATH that I'm concerned about. It can and does
screw up linking if it's set wrong.

> Furthermore, the paths need to change
> when you switch to a release or release+asserts or release
> +expensive_checks build.

We have this problem today, it isn't a significant issue AFAICT.

> * There are inter-dependencies between modules which may affect
> the relative ordering of the PATH and LD_LIBRARY_PATH component
> paths.

This is only an issue if you have a name collision, right? If so the
answer is "don't do that" :)

Sure .. this entire email is about striking a balance between making the
build system "fool proof" and "flexible".

> * Building things can be affected because if you put the wrong
> directory in your LD_LIBRARY_PATH you can end up linking against
> libraries built by the compiler instead of your platform's
> native compiler, which will ultimately fail (very late too).

This is only for llvm-gcc?

Yes.

> * Having two llvm-gcc versions (4.0 and 4.2) in separate modules
> could lead to conflicts.

The only thing that depends on llvm-gcc is the llvm-test suite. Its
configure script should probably try to autodetect which C front-end you
have (4.0, 4.2, clang) and "build in" the paths it needs into its
Makefile.config.

What if you "have" all three? Which does it pick? It probably needs to
be a configure or make option. In any event, if you switch compilers and
don't "make clean" you can end up with link errors (e.g. clang compiled
object linked with llvm-gcc-4.2)

> * Upstream projects like hlvm and cfe will have several
> dependencies so getting the paths straight is important for
> successful building. Additionally, users will have their own
> project directories, at the top of the food chain, which are
> dependent on everything.

I don't see this. Building should just be a matter of typing the moral
equivalent of "make". If clang used tblgen for its build, it would know
to invoke it from the llvm module, and would use an absolute path
generated by the makefile.

That's one way to do it :)

> * We want to treat each module, as much as possible, as a separate
> entity (very loose coupling), but they are API locked anyway and
> we can't do much about that. The dependencies are real.

Yep. The dependencies are hard dependencies, though it would also be nice
to support "optional dependencies" down the line (if you check out "this"
in your tree, it enables "that" feature in some other dependent module).

Okay, you can implement that feature :)

> * There are utilities that we want in the paths (like llvm/utils)
> as well as utilities like TableGen that might eventually be
> needed across projects (e.g. "core" would need TableGen for the
> intrinsic functions but the module containing the targets also
> needs it).

The makefiles that build the projects should not depend on PATH. The only
need for PATH to be set is if the user wants to invoke something (like
llvm-as, opt, etc).

Yeah, I mixed up two things here. So the user's path might want to have
all the llvm-top/*/utils directories in their PATH? And if you want to
keep TestRunner.sh in test, then you need that in your path, and ...
this just gets awkward quickly.

As for build utilities, I'm trying to strike a balance here between hard
coding paths (which makes it brittle) and having it "just work". I think
it's probably fine to put things in "support" that many other modules
will use. For example, the makefile system itself. That one module name
can be hard coded. But, as we go along, there may be other things used
(e.g. the hlvm utilities for all of hlvm's front ends) during build. To
avoid PATH for such things we need to know the module it's in, *OR* just
always reference the installed thing and then you have perfect control
over which utility is being used, even while developing experimental
versions of that utility (into module/Debug/bin).

> * Does every module need its own "llvm-config" program?

It would be nice if this was shared, perhaps to live in the support
module?

I was thinking the same thing, but then it can't give you all the
project specific configuration, just the stuff that support knows about
(which is a lot, but not everything). Recall that llvm-config needs to
be built after all the libraries have been built so it can pick up all
the dependencies and generate the -l options in the right order. So,
conversely, it actually needs to be in a module that is the LAST one
built, not the first.

> * Some of us have multiple things going on at the same time and so
> work with multiple LLVM environments. For example, you could be
> working on an involved bug fix, your normal development work,
> quickie fixes, a branch for some side work, etc. In each of
> these cases you want a separate checkout and the associated
> environment variable settings for that directory. I call this an
> "environment". It is basically just a way to keep various works
> in progress separated. How can multiple environments be best
> supported?

Wow, those people need to learn to work more incrementally ;-). j/k

Oh, it is incremental .. there's just a lot of increments :)

It seems that they should just check out multiple trees and have scripts
or something to set their PATH as appropriate... just like today.

I use this sort of thing when I have a frozen version of a tree for some
project, and we need to backport a patch to that tree. In this case, I
don't mess with my path at all, I just manually invoke utilities from that
tree with absolute paths.

I'm too forgetful for that. I want to run something that would put me in
"my backport patch branch environment" so I can just type "llvm-as" and
it gets the right one. Then all I have to do is remember to run that
thing.

> So, the question is .. what do you want to do about all this?
>
> Here are some options to be discussed:
>
> 1. Punt - Let each developer/user figure this out for
> themselves.

This is the de facto answer until we get a solution :)

Of course :)

> 2. Install - That is, set your PATH and LD_LIBRARY_PATH to one
> place and "make install" the build results into that directory.

We *need* to support make install, but we also should not make it
required. End users just want to 'check out/download + build + install',
they don't want to mess with their environment or anything else for that
matter.

So, you're saying if you just want to use LLVM then invoke "install" and
set your path to the install location.

But, if you're a developer then .. see #1 :)

> 3. Shell - Provide some shell functions and aliases to manage
> setting the environment correctly. This could even use the
> ModuleInfo.txt file to glean dependencies. For example, the
> llvm-top module could have a "setenv.sh" script that is invoked
> with ". ./setenv.sh" to set the environment for whatever is
> checked out in that llvm-top. We'd need one for each type of
> shell and users would have to remember to run it.

Ick.

Okay, disapproval noted. Got anything substantive to add to that?

> I need help with #4 but I'm also looking for general feedback on solving
> the issues raised.

To be clear, we're talking about LLVM developers here, not end users (who
just use make install). I think LLVM devs can know to add a directory or
two if they want convenient access to some llvm tool that gets built.
Worst case they can use absolute paths if they want.

So, basically .. #1 :)

     * support - lib/Support, lib/System, autoconf, make support,
       utilities

which utilities? The C++ programs in llvm/utils should not be moved.

No, I was thinking more like "mkpatch" and "llvmgrep".

Makes sense.

     * core - VMCore, Asm, Bitcode and the essential IR tools (llvm-as,
       etc.)

I'm still not convinced that this is useful to split out from the rest of
the LLVM tree, we should discuss this again after support is split out.

Surely. About the only reason is for modules that only work with the IR
and don't want to have to compile all of the transforms and targets just
to use the IR. I know there's no technical difference, it's merely a
developer convenience. In any event, we can discuss later. My point was
there will be more modules.

yep. we're on the same page.

Regardless of the user's PATH setting, the build process for the various
modules should invoke the tools from other modules *without* PATH
needing to be set.

Yeah, it's more LD_LIBRARY_PATH that I'm concerned about. It can and does
screw up linking if it's set wrong.

It screws up *dynamic linking*, not linking.

This is only an issue if you have a name collision, right? If so the
answer is "don't do that" :)

Sure .. this entire email is about striking a balance between making the
build system "fool proof" and "flexible".

Just make it an error if llvm-config detects a conflict and you're done.
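
A sketch of such a conflict check, written as standalone shell functions rather than as part of llvm-config. The function names and the idea of comparing file names across each module's bin directory are assumptions about what "conflict" would mean here.

```shell
# find_tool_collisions DIR...: print every file name that occurs in
# more than one of the given directories, one name per line.
find_tool_collisions() {
    for _d in "$@"; do
        if [ -d "$_d" ]; then
            ls "$_d"
        fi
    done | sort | uniq -d
}

# check_tool_collisions DIR...: fail with a message if any name collides.
check_tool_collisions() {
    _dups=$(find_tool_collisions "$@")
    if [ -n "$_dups" ]; then
        echo "error: tools provided by more than one module:" >&2
        echo "$_dups" >&2
        return 1
    fi
}
```

If no two modules ship a tool with the same name, the relative order of the PATH entries stops mattering.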

     * Building things can be affected because if you put the wrong
       directory in your LD_LIBRARY_PATH you can end up linking against
       libraries built by the compiler instead of your platform's
       native compiler, which will ultimately fail (very late too).

This is only for llvm-gcc?

Yes.

Okay, this is an issue, but it seems like we already have this issue. In any case, it can be fixed through magic (rpath?) that embeds information in the generated executable. I don't know much about this, but I am confident we can solve it if/when it is an issue :)
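
For reference, a sketch of the rpath idea, assuming a GCC-style compiler driver and a linker that accepts -rpath (as GNU ld does). The helper name rpath_flags is made up. It emits, for each library directory, a -L flag for link time and a matching -Wl,-rpath flag that records the directory in the executable, so the runtime loader can find the libraries without LD_LIBRARY_PATH.

```shell
# rpath_flags DIR...: print "-L DIR" and "-Wl,-rpath,DIR" style flags
# for each library directory, separated by single spaces.
rpath_flags() {
    _flags=
    for _d in "$@"; do
        _flags="$_flags${_flags:+ }-L$_d -Wl,-rpath,$_d"
    done
    printf '%s\n' "$_flags"
    unset _flags _d
}
```

A makefile could pass the output of such a helper on the link line. Whether embedded absolute paths are acceptable for installed binaries is a separate question this sketch does not address.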

     * Having two llvm-gcc versions (4.0 and 4.2) in separate modules
       could lead to conflicts.

The only thing that depends on llvm-gcc is the llvm-test suite. Its
configure script should probably try to autodetect which C front-end you
have (4.0, 4.2, clang) and "build in" the paths it needs into its
Makefile.config.

What if you "have" all three? Which does it pick? It probably needs to
be a configure or make option. In any event, if you switch compilers and
don't "make clean" you can end up with link errors (e.g. clang compiled
object linked with llvm-gcc-4.2)

Okay, let me clarify: the configure script *defaults* to finding one of the things you have. The configure script should also provide an option that lets you choose a specific one. It would be wonderful if you could check out llvm-test three times and configure each one to use a different cfe.
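
The default-plus-override behaviour could look roughly like this in a configure-style shell script. The function name pick_cfe is hypothetical, and so are whatever candidate program names a real configure script would search for.

```shell
# pick_cfe CHOICE CANDIDATE...: print the path of the C front-end to use.
# A non-empty CHOICE must exist on PATH and always wins; otherwise the
# first CANDIDATE found on PATH is used.
pick_cfe() {
    _choice=$1
    shift
    if [ -n "$_choice" ]; then
        if command -v "$_choice"; then
            return 0
        fi
        echo "pick_cfe: requested front-end '$_choice' not found" >&2
        return 1
    fi
    for _c in "$@"; do
        if command -v "$_c" >/dev/null 2>&1; then
            command -v "$_c"
            return 0
        fi
    done
    echo "pick_cfe: no C front-end found" >&2
    return 1
}
```

The printed absolute path is what would get "built in" to Makefile.config, so three llvm-test checkouts configured with three different choices would not interfere with each other.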

equivalent of "make". If clang used tblgen for its build, it would know
to invoke it from the llvm module, and would use an absolute path
generated by the makefile.

That's one way to do it :)

Yep, I think it's pretty important to do it this way.

     * We want to treat each module, as much as possible, as a separate
       entity (very loose coupling), but they are API locked anyway and
       we can't do much about that. The dependencies are real.

Yep. The dependencies are hard dependencies, though it would also be nice
to support "optional dependencies" down the line (if you check out "this"
in your tree, it enables "that" feature in some other dependent module).

Okay, you can implement that feature :)

Hehe, "down the line" :)

The makefiles that build the projects should not depend on PATH. The only
need for PATH to be set is if the user wants to invoke something (like
llvm-as, opt, etc).

Yeah, I mixed up two things here. So the user's path might want to have
all the llvm-top/*/utils directories in their PATH? And if you want to
keep TestRunner.sh in test, then you need that in your path, and ...
this just gets awkward quickly.

You only need to add the utilities that you invoke directly and want in your path. The standard build stuff (e.g. make test) can find the utilities they need.

As for build utilities, I'm trying to strike a balance here between hard
coding paths (which makes it brittle) and having it "just work". I think
it's probably fine to put things in "support" that many other modules
will use. For example, the makefile system itself. That one module name
can be hard coded. But, as we go along, there may be other things used
(e.g. the hlvm utilities for all of hlvm's front ends) during build. To
avoid PATH for such things we need to know the module it's in, *OR* just
always reference the installed thing and then you have perfect control
over which utility is being used, even while developing experimental
versions of that utility (into module/Debug/bin).

I suggest that projects publish makefile fragments. For example, makefile.core could provide a fragment that includes:

LLVMToolDir := $(LLVMTopObjDir)/core/$(BuildMode)/bin
LLVMLibDir := $(LLVMTopObjDir)/core/$(BuildMode)/lib
LLVMAS := $(LLVMToolDir)/llvm-as$(EXEEXT)
LLC := $(LLVMToolDir)/llc$(EXEEXT)
LLI := $(LLVMToolDir)/lli$(EXEEXT)
...

If the hlvm project wanted to use this, it would just include $(LLVMTopDir)/core/Makefile.core

to get all these definitions. That way, you now have absolute paths available everywhere.

     * Does every module need its own "llvm-config" program?

It would be nice if this was shared, perhaps to live in the support
module?

I was thinking the same thing, but then it can't give you all the
project specific configuration, just the stuff that support knows about
(which is a lot, but not everything). Recall that llvm-config needs to
be built after all the libraries have been built so it can pick up all
the dependencies and generate the -l options in the right order. So,
conversely, it actually needs to be in a module that is the LAST one
built, not the first.

Great point. It sounds like llvm-config should be split into two parts: a common part shared by all the actual tools, and a project-specific part that "configures" llvm-config.
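
One way to picture that split is a common driver that reads a small per-project description file. The sketch below is an illustration under stated assumptions, not the real llvm-config: the function name, the description file, and its `LIBS=` line format are all hypothetical, and the file is assumed to list libraries already in link order.

```shell
# Common part: turn a project-supplied library list into -l options.
# Project-specific part: a file containing a line such as
#   LIBS=LLVMCore LLVMSupport
# (hypothetical format), written once all libraries have been built.
llvm_config_libs() {
    cfg_file=$1
    # Extract the value of the LIBS= line; other lines are ignored.
    libs=$(sed -n 's/^LIBS=//p' "$cfg_file")
    out=
    for lib in $libs; do
        # Preserve the order given in the file, separating options by spaces.
        out="${out:+$out }-l$lib"
    done
    printf '%s\n' "$out"
}
```

Under this arrangement the driver could live in the support module, while each project generates its own description file at the end of its build, which is when the dependency order is known.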

I'm too forgetful for that. I want to run something that would put me in
"my backport patch branch environment" so I can just type "llvm-as" and
it gets the right one. Then all I have to do is remember to run that
thing.

Sure, I agree that's useful, I just think it is something people can develop on their own. Alternatively, we can provide scripts in utils for people to use if they want.
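
As a rough idea of what such a script might look like, here is a minimal sketch of an environment-switching function. It assumes every module sits directly under one checkout directory and puts its build products in Debug/bin and Debug/lib; the function name echoes the "llvmenv" mentioned earlier in the thread, but the body and the LLVM_ENV marker variable are hypothetical.

```shell
# Hypothetical "llvmenv": point PATH and LD_LIBRARY_PATH at the Debug build
# products of every module directory found under one checkout.
llvmenv() {
    top=$1
    for module_dir in "$top"/*/; do
        module_dir=${module_dir%/}
        if [ -d "$module_dir/Debug/bin" ]; then
            PATH="$module_dir/Debug/bin:$PATH"
        fi
        if [ -d "$module_dir/Debug/lib" ]; then
            LD_LIBRARY_PATH="$module_dir/Debug/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
        fi
    done
    # Hypothetical marker recording which environment is active.
    LLVM_ENV=$top
    export PATH LD_LIBRARY_PATH LLVM_ENV
}
```

The function has to be defined in, or sourced into, the current shell to affect it. The sketch does not remove entries left by a previous call, and on platforms that use SHLIB_PATH or a similar variable the library path variable would need to change.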

    2. Install - That is, set your PATH and LD_LIBRARY_PATH to one
       place and "make install" the build results into that directory.

We *need* to support make install, but we also should not make it
required. End users just want to 'check out/download + build + install',
they don't want to mess with their environment or anything else for that
matter.

So, you're saying if you just want to use LLVM then invoke "install" and
set your path to the install location.

Yes.

    3. Shell - Provide some shell functions and aliases to manage
       setting the environment correctly. This could even use the
       ModuleInfo.txt file to glean dependencies. For example, the
       llvm-top module could have a "setenv.sh" script that is invoked
       with ". ./setenv.sh" to set the environment for whatever is
       checked out in that llvm-top. We'd need one for each type of
       shell and users would have to remember to run it.

Ick.

Okay, disapproval noted. Got anything substantive to add to that?

I think a combination of:
1) letting people add stuff to their PATH as they want to, and
2) providing simple scripts in utils for people who want to use them

is sufficient.

Worst case they can use absolute paths if they want.

So, basically .. #1 :)

Yep. We can always improve it later.

-Chris

     2. Install - That is, set your PATH and LD_LIBRARY_PATH to one
        place and "make install" the build results into that directory.
  
I'm just starting to use LLVM, but this really feels like the easiest way to go. I've been a developer of the Liberty Simulation Environment for several years now, and the question of how to manage environments was an important one for us, which I think we got partially wrong.

We wanted to be able to have both inter-module shared bin/lib/share directories as well as the ability to direct each module to its own individual bin/lib/share and the ability to use the source directories for that as well (compiling in place). We went with a solution where each module had its own set of environment variables pointing to its bin/lib/share with a script which allowed the user to set all these values and then created the proper PATH and LD_LIBRARY_PATH and CLASSPATH and PYTHONPATH. How has this worked out? We're up to around 30 environment variables, but AFAIK none of us actually sets them to different values than the defaults -- no one tries to point them to the source directories -- though this is still supported and is somewhat hard to maintain. We've found it to be much easier to use the shared directories so we only have to think about a single environment variable. BTW, having so many variables creates a huge environment which has occasionally interfered with software which has to transfer it, such as Condor.

So, I'd suggest the use of a single LLVM_ROOT for the parent of a shared bin/lib/share and which you'd just use for the prefix when calling configure. This is also very compatible with installing LLVM as the system compiler.

An install-based system also makes it relatively painless to deal with branches, releases, multiple versions, and multiple architecture/OS combinations; you simply set different builddirs and install paths for each. The way I've organized this for myself in Liberty is with the following directory structure:

Liberty
    Trunk
            src
                various modules
            build
                arch/OS1
                arch/OS2
            Install
                   arch/OS1
                         bin
                         lib
                         share
                   arch/OS2
                         bin
                         lib
                         share
    Branch1
          <same structure>
    Tagged release-I-have-to-support
          <same structure>

You can consider the level under Liberty to be the different environments. I then have a startLiberty "environment" script which takes an environment name (defaulting to Trunk) as its command-line argument, figures out the Install path taking into account the arch/OS on which it is running, and calls the Liberty environment variable setting script with appropriate options. Note that to make the switching of arch/OS work, the builddir and sourcedir must not be the same and when I switch the arch/OS I'm working on within an environment I have to clean out various autoconf caches and rerun configure. I've got another script which does that for me. But I don't know that LLVM really needs to provide scripts to manage at this level of granularity; certainly having the arch/OS support is overkill for someone who works on only one system and other Liberty developers have used different organizations.
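
The environment script described above could be sketched roughly as follows. This is not the actual startLiberty script: the function names are hypothetical, the arch and OS components are assumed to come from `uname -m` and `uname -s` as two directory levels, and the single LLVM_ROOT variable follows the suggestion made earlier in this message.

```shell
# Hypothetical helper modelled on the layout above:
#   <base>/<environment>/Install/<arch>/<OS>/{bin,lib,share}
# The environment name defaults to "Trunk".
env_install_dir() {
    base=$1
    env_name=${2:-Trunk}
    arch=$(uname -m)
    os=$(uname -s)
    printf '%s/%s/Install/%s/%s\n' "$base" "$env_name" "$arch" "$os"
}

# Switch the current shell to one environment's install tree.
start_env() {
    LLVM_ROOT=$(env_install_dir "$1" "${2:-}")
    PATH="$LLVM_ROOT/bin:$PATH"
    LD_LIBRARY_PATH="$LLVM_ROOT/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LLVM_ROOT PATH LD_LIBRARY_PATH
}
```

With this sketch, `start_env ~/Liberty Branch1` would select the Branch1 install tree for the current machine, and the same LLVM_ROOT value could be passed to configure as the prefix when building that environment.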

--David

All,

After reviewing the feedback on this email (thanks!) and discussing it
on IRC we've decided to opt for flexibility but still provide some help,
as follows:

     1. Environment variables are the developer's business and we'll
        dictate no requirements here. Building llvm will work regardless
        of your environment variables.
     2. The "make install" approach will be supported throughout. If you
        want to roll your own environments, you can always do that.
        Several people preferred this approach and whatever we do you
        will still be able to "make install" and set your path
        accordingly.
     3. We will be providing a "support" module (it's nearly ready) that
        will contain fundamental configuration (autoconf) scripts,
        system and support libraries, and the build system (makefiles).
        This will work in such a way that no environment variables are
        required. Where paths are needed, there will be configure script
        options to specify them.
     4. The "llvm-top" module is also nearly ready for use. This module
        is aimed at making it "dead simple" to checkout, build and
        install llvm software. It also sets up all the llvm modules in a
        single directory making it easier to create single directory
        environments. While its use will always be optional, it is also
        strongly recommended, especially for novices.
     5. We will provide some shell script functions to make switching
        your llvm-top based environments easier. These are entirely
        optional.

Thanks for the feedback. I'll keep you posted on the progress of this
stuff. Hopefully, LLVM gets much easier to deal with in the near future.

Reid.