Compiler Driver Decisions

I dunno. Perhaps cause Misha liked it. But, you do have a point there.

For that matter, why llvmc? It's more than a compiler. It also (and
mainly) links and optimizes. So, why not just "llvm" ?

Reid.

Hrm, I don't think we want to overload "llvm" to mean yet-another-concept.

It's already the name of the project and the IR... this causes enough
confusion as it is. What trouble could one extra little 'c' cause? :slight_smile:

-Chris

> > > 1. Name = llvmcc
> >
> > Why not 'llvmc' "llvm compiler"? What does the extra C mean?
>
> I dunno. Perhaps cause Misha liked it. But, you do have a point there.

LLVMCC = LLVM Compiler Collection, a la GCC
After all, it's going to be the "driver", like GCC, and unify
front-ends, so I should be able to do:

% llvmcc a.java -o a.o
% llvmcc b.cpp -o b.o

Right?

> For that matter, why llvmc? It's more than a compiler. It also (and
> mainly) links and optimizes. So, why not just "llvm" ?

LOC : LLVM Optimizing Compiler. :slight_smile:
In case LOC doesn't have enough meanings already...

> Hrm, I don't think we want to overload "llvm" to mean
> yet-another-concept.

I agree, but everyone at some point starts thinking of 'LLVM' as *the*
compiler, so perhaps people ALREADY have that viewpoint.

> It's already the name of the project and the IR... this causes enough
> confusion as it is. What trouble could one extra little 'c' cause? :slight_smile:

I think some terminology clarification would be in order... :slight_smile:

> > Since there's been little feedback on the design document I sent out,
> > some decisions are being made in order to progress the work. If you have
> > strong feelings about any of these, voice them now!
> >
> > 1. Name = llvmcc
>
> Why not 'llvmc' "llvm compiler"? What does the extra C mean?

> I dunno. Perhaps cause Misha liked it. But, you do have a point there.
>
> For that matter, why llvmc? It's more than a compiler. It also (and
> mainly) links and optimizes. So, why not just "llvm" ?

How about you leave it as llvmc for now and when it's functioning you can
revisit this subject. Seems like a minor thing to waste time discussing.

-Tanya

Perhaps, but I have to create directories and documents and content for
those things that use this name. I really don't want to go back and
revisit everything I'm about to write. Not that it's hard, it's just a
pain.

Reid.

> > > > 1. Name = llvmcc
> > >
> > > Why not 'llvmc' "llvm compiler"? What does the extra C mean?
> >
> > I dunno. Perhaps cause Misha liked it. But, you do have a point there.

> LLVMCC = LLVM Compiler Collection, a la GCC
> After all, it's going to be the "driver", like GCC, and unify
> front-ends, so I should be able to do:
>
> % llvmcc a.java -o a.o
> % llvmcc b.cpp -o b.o
>
> Right?

Absolutely. The problem is that "C compiler" is what people think of when
they see CC. This we certainly are not. If we are really a compiler of
code, why not just call it llvmc? Also, just because GCC set a precedent
here does not mean that it needs to be followed. Their renaming to
compiler collection is largely due to historical reasons.

> It's already the name of the project and the IR... this causes enough
> confusion as it is. What trouble could one extra little 'c' cause? :slight_smile:

> I think some terminology clarification would be in order... :slight_smile:

I'm just advocating not making the situation worse :slight_smile:

-Chris

I actually like Misha's point here. Most people that have used GCC
recently realize that the CC means "Compiler Collection" and not "C
Compiler" which is appropriate given what it does. Since we intend to be
front end language agnostic and the driver tool will support multiple
front end languages, "Compiler Collection" is appropriate for LLVM too.

I agree that llvm is overloaded and should be avoided. So it's either
llvmc or llvmcc. My vote is for the latter.

Reid.

What is the difference between a "compiler collection" and a "compiler"?
How about llvmcs "llvm-compiler system" or something else non-cc? :slight_smile:

-Chris

The difference is that most people associate the word "compiler" with a
single language: e.g. the C++ compiler, the Pascal compiler, the Fortran
compiler. But, this driver tool isn't the compiler for any language that
LLVM now or will ever support. Other tools are "the compiler". What it
does do is invoke those compilers. Because it can invoke many of those
language compilers, even multiple on the same execution, the notion of
"compiler collection" is pretty accurate. However, if you want to avoid
the cc suffix, let's explore some alternatives:

llvmcd - llvm compiler driver
llvmci - llvm compiler invoker
llvmcs - llvm compiler system (or perhaps "compilation system")
llvmct - llvm compiler tool
llvmx - llvm eXecutive
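
To make the "it invokes those compilers, even multiple on the same execution" point concrete, here is a minimal sketch in Python of how such a driver might group its inputs by front-end. The suffix table and the front-end names (c-frontend, cxx-frontend, java-frontend) are placeholders invented for illustration; they are not real LLVM tools and nothing in this thread specifies them.

```python
import os
from collections import defaultdict

# Hypothetical suffix -> front-end table. The tool names are placeholders,
# not real LLVM programs.
FRONT_ENDS = {
    ".c": "c-frontend",
    ".cpp": "cxx-frontend",
    ".java": "java-frontend",
}


def plan_invocations(inputs):
    """Group input files by the front-end that should compile them."""
    plan = defaultdict(list)
    for path in inputs:
        suffix = os.path.splitext(path)[1]
        if suffix not in FRONT_ENDS:
            raise ValueError(f"no front-end registered for {path!r}")
        plan[FRONT_ENDS[suffix]].append(path)
    return dict(plan)
```

Given a.java, b.cpp and c.cpp in one run, the plan holds one entry per front-end, which is the sense in which the driver is not itself "the compiler" for any one language.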

Reid.

I think that we can really pick any name that we want. We don't have to go
along with "tradition" and name it "cc". Personally, compiler collection
is kinda lame anyways. I'm all about something short though, so "llvmc" is
my vote.

Ok. So I voted, so now can we tally the votes and put this discussion to
an end? :wink:

-Tanya

> I think that we can really pick any name that we want. We don't have to go
> along with "tradition" and name it "cc". Personally, compiler collection
> is kinda lame anyways. I'm all about something short though, so "llvmc" is
> my vote.

A lot of the names are fine with me, I'm just trying to find a crowd
pleaser. However, brevity is high on my list too. How about "doit" :slight_smile:

> Ok. So I voted, so now can we tally the votes and put this discussion to
> an end? :wink:

I think I need to wait for Misha and Vikram to weigh in on this
discussion before we take it to a vote. Hopefully, there will be some
consensus on the name soon :slight_smile:

Reid.

> What is the difference between a "compiler collection" and a "compiler"?
> how about llvmcs "llvm-compiler system" or something else non-cc? :slight_smile:

> The difference is that most people associate the word "compiler" with a
> single language: e.g. the C++ compiler, the Pascal compiler, the Fortran
> compiler. But, this driver tool isn't the compiler for any language that

Ok.

> LLVM now or will ever support. Other tools are "the compiler". What it
> does do is invoke those compilers. Because it can invoke many of those
> language compilers, even multiple on the same execution, the notion of
> "compiler collection" is pretty accurate. However, if you want to avoid
> the cc suffix, let's explore some alternatives:
>
> llvmcd - llvm compiler driver
> llvmci - llvm compiler invoker
> llvmcs - llvm compiler system (or perhaps "compilation system")
> llvmct - llvm compiler tool
> llvmx - llvm eXecutive

How about llvm-foo? We can retroactively come up with the meaning for foo
later :slight_smile:

Alternatively, I like llvmcs or llvmct, either work for me, though I lean
towards llvmcs. :slight_smile:

-Chris

I can live with "llvmcs: LLVM Compilation System" .. it has a cute
overtone to "Computer Science" too.

So, now that you and I agree, let's see if Tanya, Misha, and Vikram will
agree with this.

Tanya: can you live with six letters instead of five?
Misha: can you live with one "c" in your name being "s" instead?
Vikram: can you find time to respond to this? ;>

Reid.

I like llvmcs. Contrary to the IRC discussion, I am not sure I want a
hyphen in this ... Without a hyphen, it could still be the compiler
system, with the hyphen, I'd say almost definitely computer science. :slight_smile:

Not that there's anything wrong with it, just weird... :slight_smile:

I'm happy with llvmcs or llvmcc.

For those of you who remember OPEN LOOK, I would laughingly suggest 'llvmtool'.
:slight_smile:

-Brian

I also notice that cs has the double-meaning of llvm.cs[.uiuc.edu] or
just simply llvm.cs, i.e., LLVM implemented in C#. :slight_smile:

I have been at Microsoft the last couple of days and so couldn't join the discussion earlier. Here's my view of the name issue, and (the reason this is long), a little about how I think we want users to view this tool:

First, I think the name should convey the purpose of the tool -- otherwise, it just creates a confusing acronym (and goodness knows we have enough names already, even though most of them are clear). Unfortunately this leads me to vote against llvmcs -- it's vague and (worse) a misnomer. A "system" is not a program or a compiler driver or an invoker or anything specific like that. *LLVM* is a system; this program is a program with a more limited purpose. Of the list below, llvmcd comes closest to describing what this tool does.

Another possible name is llvmgen (even though I do think it should have options to compile to native code). This fits slightly better with my view of how users should view the compilation process in LLVM (but llvmcd is ok too).

I would like to encourage users to think about the "normal" output of our compiler as llvm code, not native code. There are many reasons for this but here are two key ones:

(1) Most compilations are run one or several files at a time, so interprocedural optimizations cannot be done when running the compiler driver. Modern compilers present a misleading view that front-ends generate object code. That's not true at higher levels of optimization but users never realize that -- front-ends really generate some kind of IR file that is read by the link-time whole-program optimizer. We can avoid this confusing view by making llvm output the "normal" and default case.

(2) Once the dynamic optimizer Brian and Tanya and Anand have worked on is ready for distribution, I would like to make llvmee (Misha's llvm execution environment) the expected, default way to run llvm programs. The llvm driver (llvmcd or llvmgen or whatever name we pick) would generate llvm code that is later compiled and run transparently by llvmee. Whether the code generation to native code happens offline (via llc) or online (via the jit) can be controlled via llvmee. Again, this requires reinforcing the view that shipped programs are in llvm form, not native form.

Note that none of this prevents people from using the driver to generate native code in one step. But the tool names and default behavior should convey our view of what is important about LLVM. I think we *don't* want users to view llvm as just another standard source-to-native code compiler.

For the name, again, I'd be happy with either llvmcd or llvmgen, with a preference for the latter. The more important thing is to make llvm code be the default output.

--Vikram
http://www.cs.uiuc.edu/~vadve
http://llvm.cs.uiuc.edu/

> First, I think the name should convey the purpose of the tool --
> otherwise, it just creates a confusing acronym (and goodness knows we
> have enough names already, even though most of them are clear).

Yes, I totally agree.

> Unfortunately this leads me to vote against llvmcs -- it's vague and
> (worse) a misnomer. A "system" is not a program or a compiler driver or an
> invoker or anything specific like that. *LLVM* is a system; this
> program is a program with a more limited purpose. Of the list below,
> llvmcd comes closest to describing what this tool does.

I don't like llvmcs either, though I don't know of a good name. The
problem is that we are entering the realm of LLVM as a *compiler* not as a
bunch of utilities for building compilers. As such, the compiler driver
(which is an LLVM tool) is really the user's interface to a C or C++ or
Java compiler. As such, they don't want to think of it as a compiler
driver, they want to think of it as THE compiler. When we say 'gcc foo.c'
we are actually invoking the GCC compiler driver, though we name it 'gcc'.

This is part of why it is critical to have a good name :). Note however,
that it may be perfectly reasonable to just name the tool llvm-driver or
something, then have individual compilers built from LLVM install symlinks
into /usr/bin. Given that, they could call themselves whatever they want,
while still invoking llvm-driver.
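
One way to read the symlink idea: the driver inspects the name it was invoked under and picks a language "personality" from it. A minimal sketch in Python, assuming hypothetical link names (mycc, myc++, myjavac) that do not come from this thread; only llvm-driver is a name actually floated above.

```python
import os

# Hypothetical mapping from the invocation name (for example a symlink in
# /usr/bin) to a source language. None means "decide per input file".
PERSONALITIES = {
    "llvm-driver": None,
    "mycc": "c",
    "myc++": "c++",
    "myjavac": "java",
}


def language_for(argv0):
    """Return the language implied by the invocation name, or None."""
    name = os.path.basename(argv0)
    if name not in PERSONALITIES:
        raise ValueError(f"unknown driver name {name!r}")
    return PERSONALITIES[name]
```

A real driver would pass its own argv[0] to language_for; every symlink still runs the same program, so the per-compiler names are purely cosmetic.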

> I would like to encourage users to think about the "normal" output of
> our compiler as llvm code, not native code. There are many reasons for
> this but here are two key ones:

I agree, we definitely want to support this, but we also want to be able
to support traditional static compilation as well. In particular, there
should be some flag (-native?) or something that you can give to the
compiler to direct it to produce a native executable.

-Chris

> I have been at Microsoft the last couple of days and so couldn't
> join the discussion earlier.

No worries. I knew you'd chime in sooner or later :slight_smile:

> Here's my view of the name issue, and (the reason this is long), a
> little about how I think we want users to view this tool:
>
> First, I think the name should convey the purpose of the tool ...
>
> Another possible name is llvmgen (even though I do think it should
> have options to compile to native code). ...
>
> I would like to encourage users to think about the "normal" output of
> our compiler as llvm code, not native code. There are many reasons for
> this but here are two key ones:
>
> (1) Most compilations are run one or several files at a time, so
> interprocedural optimizations cannot be done when running the compiler
> driver. Modern compilers present a misleading view that front-ends
> generate object code. That's not true at higher levels of
> optimization but users never realize that -- front-ends really
> generate some kind of IR file that is read by the link-time
> whole-program optimizer. We can avoid this confusing view by making
> llvm output the "normal" and default case.
>
> (2) Once the dynamic optimizer Brian and Tanya and Anand have worked
> on is ready for distribution, I would like to make llvmee (Misha's
> llvm execution environment) the expected, default way to run llvm
> programs. The llvm driver (llvmcd or llvmgen or whatever name we
> pick) would generate llvm code that is later compiled and run
> transparently by llvmee. Whether the code generation to native code
> happens offline (via llc) or on-line (via the jit) can be controlled
> via llvmee. Again, this requires reinforcing the view that shipped
> programs are in llvm form, not native form.
>
> Note that none of this prevents people from using the driver to
> generate native code in one step. But the tool names and default
> behavior should convey our view of what is important about LLVM. I
> think we *don't* want users to view llvm as just another standard
> source-to-native code compiler.

I hadn't considered this perspective and I agree. In fact, I agree to
the point that I would consider naming the tools llvm-gen and llvm-run.
That is, llvm-gen does whatever is necessary to completely or partially
create a 100% LLVM (i.e. bytecode) executable. And, llvm-run (what
you're calling llvmee) would do whatever is necessary to run and
possibly re-optimize that program. That should be the normal and
default behavior of the two tools.

Since it's also important for users to feel somewhat comfortable with
their new toolkit, I think we should support an optional argument,
possibly -native, that would cause llvm-gen to work towards a native
executable instead of an LLVM executable.
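
As a rough illustration of "LLVM output by default, native only on request", here is a minimal Python sketch of the argument handling. The -native spelling is only the tentative suggestion from this thread, and the function name and return values are invented for the example.

```python
def parse_driver_args(argv):
    """Split argv into (output_kind, inputs).

    Bytecode is the default; the tentatively proposed -native flag switches
    the final product to a native executable.
    """
    kind = "bytecode"
    inputs = []
    for arg in argv:
        if arg == "-native":
            kind = "native"
        else:
            inputs.append(arg)
    return kind, inputs
```

With no flag the driver stays on the LLVM path, so a user has to ask for native output explicitly rather than getting it by accident.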

> For the name, again, I'd be happy with either llvmcd or llvmgen, with a
> preference for the latter. The more important thing is to make llvm
> code be the default output.

I'm not thrilled with llvm-gen because it has overtones of "code
generation" (which actually isn't far from the truth). But, it conjures
up the wrong kind of tool .. something more like yacc or bison. However,
I don't have any good alternatives in mind. I'll ponder some more.

Reid.

> Unfortunately this leads me to vote against llvmcs -- it's vague and
> (worse) a misnomer. A "system" is not a program or a compiler driver or an
> invoker or anything specific like that. *LLVM* is a system; this
> program is a program with a more limited purpose. Of the list below,
> llvmcd comes closest to describing what this tool does.

> I don't like llvmcs either, though I don't know of a good name. The
> problem is that we are entering the realm of LLVM as a *compiler* not as a
> bunch of utilities for building compilers. As such, the compiler driver
> (which is an LLVM tool) is really the user's interface to a C or C++ or
> Java compiler. As such, they don't want to think of it as a compiler
> driver, they want to think of it as THE compiler. When we say 'gcc foo.c'
> we are actually invoking the GCC compiler driver, though we name it 'gcc'.

I agree with this, but ..

> This is part of why it is critical to have a good name :). Note however,
> that it may be perfectly reasonable to just name the tool llvm-driver or
> something, then have individual compilers built from LLVM install symlinks
> into /usr/bin. Given that, they could call themselves whatever they want,
> while still invoking llvm-driver.

.. not with this. I don't want to force compiler writers to have to make
yucky symlinks to llvm-driver. Also, I think it would be good for all
involved if they could think of LLVM (via their use of the driver) as
this magic black box that just gets compilation done, regardless of the
languages involved. Otherwise they will have to remember the weird
names used by all the language developers for all the different language
compilers. I'd rather hide that complexity from the user and just
provide them with a one-stop tool that does LLVM compilation. It's also
very useful for language developers to just "slot in" a new version of
the front end with appropriate configuration tools and the driver never
has to change.
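
The "slot in" idea could look roughly like the Python sketch below: front-ends are described in a small text configuration, and the driver only reads that table. The "suffix = command" file format and the command names used in the usage note are assumptions made up for illustration, not a format anyone in this thread has agreed on.

```python
import os


def load_front_ends(config_text):
    """Parse 'suffix = command' lines, skipping blank lines and # comments."""
    table = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if "=" not in line:
            raise ValueError(f"malformed config line: {raw!r}")
        suffix, command = (part.strip() for part in line.split("=", 1))
        table[suffix] = command
    return table


def command_for(table, path):
    """Build the argument list that compiles path with its configured front-end."""
    suffix = os.path.splitext(path)[1]
    if suffix not in table:
        raise ValueError(f"no front-end configured for {path!r}")
    return table[suffix].split() + [path]
```

Adding a language then means adding one configuration line such as ".java = javafe -emit-bytecode" (a made-up command); the driver code itself stays untouched.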

> I would like to encourage users to think about the "normal" output of
> our compiler as llvm code, not native code. There are many reasons for
> this but here are two key ones:

> I agree, we definitely want to support this, but we also want to be able
> to support traditional static compilation as well. In particular, there
> should be some flag (-native?) or something that you can give to the
> compiler to direct it to produce a native executable.

As I said in my last message, I think the default behavior should be
that the driver works towards executing LLVM bytecode (in whatever way
appropriate) and we provide the -native option for those that want to
force the driver to produce native executables (but not native object
files!)

Reid.