Benchmarks

Dear List,

There's been some recent discussion on the list about benchmarks. I just
read a Dr. Dobbs article on the relative runtime performance of various
compilers (8 of them compared) on Intel platforms. The test focused on
mainly template type things but offers Dhrystone and zlib for
comparisons.

There's no clear winner as all compilers perform well in some areas and
poorly in others. The overall rating (Table 2 in the article) ranks the
compilers thusly (higher is better):

Compiler          Rating
Intel 8.0         9.22
VC++ 7.1          7.56
CodeWarrior       7.44
GCC 3.2           6.67
VC++ 6.0          6.00
Comeau 4.3.3      4.56
Borland 5.6.4     3.78
Open Watcom 1.2   3.00

I am, of course, curious how LLVM stacks up in both native code
generation and the CBE. I guess the main point is that GCC is not the
compiler to beat, Intel 8.0 is (at least for the template based tests
this benchmark was aimed at). Any chance we can get this test suite
included in the benchmark?

You can find the article online here:

but you'll need to register.

Reid

I think this is a very good idea and we will definitely pursue it. Give us a little time, though. Several of the students are either doing their Ph.D. proposal or preparing to go away for summer internships or both. Adding the benchmarks and reporting initial numbers should not be too difficult, but reporting any meaningful numbers will take some work.

--Vikram
http://www.cs.uiuc.edu/~vadve
http://llvm.cs.uiuc.edu/

There's been some recent discussion on the list about benchmarks. I just
read a Dr. Dobbs article on the relative runtime performance of various
compilers (8 of them compared) on Intel platforms. The test focused on
mainly template type things but offers Dhrystone and zlib for
comparisons.

Interesting, I'll definitely take a look. LLVM should do fairly well on
the tests I would guess.

I am, of course, curious how LLVM stacks up in both native code
generation and the CBE. I guess the main point is that GCC is not the
compiler to beat, Intel 8.0 is (at least for the template based tests
this benchmark was aimed at).

Yes, definitely. GCC is not the end-goal, and for that matter, neither is
ICC. :)

Any chance we can get this test suite included in the benchmark?

There is a chance, but as Vikram mentioned, it's not extremely likely to
happen in the immediate future. However, if you or someone else wrote the
makefiles necessary to add the suite to the llvm/test/Programs hierarchy, we
would be happy to add them and have our automated testers run them. :)

In particular, I'd really like to try to foster some more community
involvement in all aspects of LLVM, including coding, porting, testing,
documentation, performance tweaking, ... Our group obviously has a lot
invested in LLVM and we will continue doing lots of interesting things,
but unfortunately we do have limited people-bandwidth. If you *really*
want something done (like any open source project), the best way is to do
it yourself or convince someone to do it for you.

I'm not trying to complain here: we have had a number of high-quality
contributions from several different people, I just wanted to point out
that asking us to do something isn't the only way to get it done. :)

-Chris

I'd be happy to contribute more and in fact, there are lots of things I
could and would contribute. Pretty much the only thing that stops me is
the project's CVS policy. Submitting patches is fine for smaller tasks
(single file bug fixes etc.) but larger tasks (like adding a whole test
suite) really need CVS write access to be done efficiently and
correctly.

Perhaps it's just me, but I don't have time to dicker with patch
creation, wait for individual files to be added, process emails that
let me know when things have been added, wait for further responses on
modifications, etc. I have a full-time job, a full-time family, and a
full-time second career (erm, perhaps I'm just over-booked? :) I need
some efficiency if I'm to do anything. From my perspective, the process
is _way_ more efficient if I can just create a branch, do what I need to
do and then tell you "look at branch xyz and merge if you like it". It's
possible with CVS to restrict access to main line commits to certain
users so you can still maintain control of the main development trend.
Furthermore, providing CVS access should reduce your administrative
burden. Instead of patching a pile of individual files, you can simply
look at the changes introduced by a branch and decide if it's something
you want to keep or inform the branch author of the things that need to
be modified. You can also set the "gold standard" for contributions to
make sure that (a) branches are maintained with the mainline by the
author (thereby reducing mainline conflicts on merge), (b) the software
on the branch must build correctly, (c) the branch software must be well
integrated into the build system, (d) the branch software must not break
any existing tests, (e) etc., etc.

I understand the University has certain legal restrictions about
granting access to non-student and non-faculty users. That may be the
trump card that prevents wider use of CVS by contributors to LLVM. If
that is the case, I would suggest that (a) the project simply accept
that contributions from others will be minimal or (b) move the CVS
repository somewhere that doesn't have the University's restrictions.
That last option, however, may have additional intellectual property
issues.

While it would be unwise to freely grant write access to the CVS
repository to anyone that asked for it, you might want to think about
some qualifications necessary to allow that to happen in a controlled
fashion. I for one don't have any problems being asked to qualify for
CVS write access. If such access were available to serious and capable
contributors, I believe you'd get a lot more contributions (as I've seen
on other projects). Furthermore, the contributions are generally of a
higher quality because the technical requirements go up. It's something
of a self-sorting process.

Ultimately the decision is yours. LLVM is still great either way. :)

Reid.

I'd be happy to contribute more and in fact, there are lots of things I
could and would contribute. Pretty much the only thing that stops me is
the project's CVS policy.

Wow, I really had no idea that this was such a problem!

(erm, perhaps I'm just over-booked? :)

Heh, I know a little bit of that feeling ;)

I need some efficiency if I'm to do anything. From my perspective, the
process is _way_ more efficient if I can just create a branch, do what I
need to do and then tell you "look at branch xyz and merge if you like
it".

Okay, that makes sense.

It's possible with CVS to restrict access to main line commits to
certain users so you can still maintain control of the main development
trend.

The problem is that CVS was not at all designed for this. It certainly is
possible to hack it enough with the right set of scripts though, and
several projects have them.

Furthermore, providing CVS access should reduce your administrative
burden. Instead of patching a pile of individual files, you can simply
look at the changes introduced by a branch and decide if it's something
you want to keep or inform the branch author of the things that need to
be modified.

That would certainly be nice. :)

I understand the University has certain legal restrictions about
granting access to non-student and non-faculty users. That may be the
trump card that prevents wider use of CVS by contributors to LLVM.

Unfortunately this is a big problem.

If that is the case, I would suggest that (a) the project simply accept
that contributions from others will be minimal or

Hrm, that's not very attractive :)

(b) move the CVS repository somewhere that doesn't have the University's
restrictions. That last option, however, may have additional
intellectual property issues.

I don't think that there would be IP issues: LLVM is (effectively) BSD
licensed, so it could be forked at any time without a problem. This would
obviously be very bad for LLVM, but it's possible.

The more I've thought about this, the more that I'm beginning to realize
that CVS is the root of the problem. Perhaps it is time for LLVM to
seriously start looking at switching over to a decentralized version
control system? I really am not "up" on the various options, but I've
heard rumors that there are now several good options.

Take 'arch' for example: its approach seems like it would solve almost all
of the version control issues that we are facing, and supports
decentralized development in particular. From what I understand, you
would be able to do all of your development on your own "local" branch,
others could have access to it, and when it's ready, we could pull it in
as one big patch or set of changes.

From your perspective, would this solve the problem? I used arch just as
an example, I'm sure there are others (bitkeeper at least supports these
features, but has unattractive licensing issues).

If it is really time to switch version control systems we probably should
have someone do some research and find out which one is the most
appropriate. Assuming that we can come up with a reasonable transition
phase, I think that this could be done.

-Chris

I bought the issue and took a look. I suspect that LLVM will do extremely
well on these tests, but it doesn't look like there is a publicly
available download for his benchmarks. I'm going to email the author and
see if we can get a copy.

-Chris

> (b) move the CVS repository somewhere that doesn't have the University's
> restrictions. That last option, however, may have additional
> intellectual property issues.

I don't think that there would be IP issues: LLVM is (effectively) BSD
licensed, so it could be forked at any time without a problem. This would
obviously be very bad for LLVM, but it's possible.

I don't see moving the repository to another system and forking the code
base as equivalent. I agree that it's definitely not time to fork the
code base. But we can change the source code control repository without
forking.

The more I've thought about this, the more that I'm beginning to realize
that CVS is the root of the problem. Perhaps it is time for LLVM to
seriously start looking at switching over to a decentralized version
control system? I really am not "up" on the various options, but I've
heard rumors that there are now several good options.

Take 'arch' for example: its approach seems like it would solve almost all
of the version control issues that we are facing, and supports
decentralized development in particular. From what I understand, you
would be able to do all of your development on your own "local" branch,
others could have access to it, and when it's ready, we could pull it in
as one big patch or set of changes.

I've looked at subversion recently. Setting it up wasn't a big deal but
it's much more complicated than CVS (e.g. you have to get a specific
version of Berkeley DB). Also, there are enough problems with it that I
deem it unstable. The last thing we need is a buggy SCC system. I think
Subversion is a good choice (functionality wise), it just isn't quite
ready yet. Perhaps by release 1.2 or so the main issues will have
settled down. I haven't looked at arch but I will.

From your perspective, would this solve the problem? I used arch just as
an example, I'm sure there are others (bitkeeper at least supports these
features, but has unattractive licensing issues).

The main problem isn't the SCC tool that we use. They all have their
good and bad points. The main problem is not being able to check things
in to a branch. As long as the SCC tool supports distributed and
parallel development (i.e. is WAN network based and supports branches),
it will be fine. So, I'm rejecting SCCS and RCS but not CVS.

If it is really time to switch version control systems we probably should
have someone do some research and find out which one is the most
appropriate. Assuming that we can come up with a reasonable transition
phase, I think that this could be done.

Sounds right. It's not a decision to be made lightly.

Reid.

I was looking for that too. The author has a web site for errata and
updates but there's nothing on it. Hopefully DDJ won't be too worried
about the IP.

Reid.

Reid,

There are no IP issues or restrictions I know of that prevent us from accepting contributions or providing direct CVS write access to non-UIUC people. If we can solve the technical issues, Chris and I would both be in favor of making write access available, in some controlled way. (As Chris said, I think it would be really unfortunate if we had to fork off the CVS repository but again, inside or outside the UIUC domain is a non-issue.)

Replacing CVS with something else may be our only option, but I would want to make sure that it is as widely available as CVS, so that prospective users don't have to download and install version management software to use llvm.

--Vikram
http://www.cs.uiuc.edu/~vadve
http://llvm.cs.uiuc.edu/

Reid,

There are no IP issues or restrictions I know of that prevent us from
accepting contributions or providing direct CVS write access to
non-UIUC people.

That's good to know. When I originally asked for CVS write access (last
fall), the issue was raised that an account would have to be created on
a UIUC machine and that current policy does not permit that for non-UIUC
people.

If we can solve the technical issues, Chris and I
would both be in favor of making write access available, in some
controlled way.

Please note that it is possible with recent versions of CVS to provide
password protected access without creating a system account. This
facility, pserver, is already set up, and is how anonymous CVS access
works now. All that needs to be done is to add some users to the
$CVSROOT/CVSROOT/passwd file. You may additionally need to set up a
separate real (Unix) user to control access. For example "anoncvs" and
"pubcvs" are common to provide read-only and read-write access to the
CVS repository, respectively.
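
As a rough illustration of the passwd approach described above: each line of $CVSROOT/CVSROOT/passwd has the form user:hash:system-user, where the optional third field names the Unix account the server runs as for that user. The sketch below only formats such lines. The helper function, the contributor name and the placeholder hash are made up for illustration; real entries would carry a crypt()-style hash produced elsewhere.

```python
# Hypothetical helper: build lines for $CVSROOT/CVSROOT/passwd.
# Line format: cvs_user:hashed_password[:system_user]

def passwd_entry(cvs_user, hashed_password, system_user=None):
    """Return one passwd line; the hash must already be computed elsewhere."""
    for field in (cvs_user, hashed_password, system_user or ""):
        if ":" in field or "\n" in field:
            raise ValueError("fields may not contain ':' or newlines")
    parts = [cvs_user, hashed_password]
    if system_user:
        parts.append(system_user)
    return ":".join(parts)


# Illustrative entries only: the names and the hash are placeholders.
entries = [
    # An empty password field lets anyone log in as this user.
    passwd_entry("anonymous", "", "anoncvs"),
    passwd_entry("reid", "PLACEHOLDER_HASH", "pubcvs"),
]
passwd_file = "\n".join(entries) + "\n"
print(passwd_file, end="")
```

CVS also consults $CVSROOT/CVSROOT/readers and $CVSROOT/CVSROOT/writers to decide which pserver users are read-only, so the anonymous user would normally be listed in readers as well.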

(As Chris said, I think it would be really unfortunate
if we had to fork off the CVS repository but again, inside or outside
the UIUC domain is a non-issue.)

The issues of forking LLVM code base and providing write access to CVS
are orthogonal. I fully agree that forking LLVM at this time would be
bad. Providing write access to the repository would be good. :)

Replacing CVS with something else may be our only option, but I would
want to make sure that it is as widely available as CVS, so that
prospective users don't have to download and install version management
software to use llvm.

I wouldn't vote for replacing CVS at this time. Yes, it's old and clunky
but it gets the job done and I'm sure you could live without the project
interruption right now.

In my opinion, Subversion is well suited to open source development and
is "familiar" for CVS users. However, moving to Subversion at this time
would be premature. It is just recently at version 1.0 and needs a
little time (few months perhaps) to settle down and get the kinks out.

Perhaps it would be prudent to wait for summer before thinking about
switching version control systems. The impact will be less and
Subversion should be in better shape by then. In the meantime,
providing write access to the existing CVS server sounds doable both
policy-wise and technically.

Reid.

I don't see moving the repository to another system and forking the code
base as equivalent. I agree that it's definitely not time to fork the
code base. But we can change the source code control repository without
forking.

Sure, I didn't mean to say they were equivalent, it's just that they would
both solve this problem.

> Take 'arch' for example: its approach seems like it would solve almost all
> of the version control issues that we are facing, and supports
> decentralized development in particular. From what I understand, you
> would be able to do all of your development on your own "local" branch,
> others could have access to it, and when it's ready, we could pull it in
> as one big patch or set of changes.

I've looked at subversion recently. Setting it up wasn't a big deal but
it's much more complicated than CVS (e.g. you have to get a specific
version of Berkeley DB). Also, there are enough problems with it that I
deem it unstable. The last thing we need is a buggy SCC system. I think
Subversion is a good choice (functionality wise), it just isn't quite
ready yet. Perhaps by release 1.2 or so the main issues will have
settled down. I haven't looked at arch but I will.

Okay, I agree that dealing with a buggy SCC system is a bad idea. :) Does
subversion support distributed development?

> From your perspective, would this solve the problem? I used arch just as
> an example, I'm sure there are others (bitkeeper at least supports these
> features, but has unattractive licensing issues).

The main problem isn't the SCC tool that we use. They all have their
good and bad points. The main problem is not being able to check things
in to a branch. As long as the SCC tool supports distributed and
parallel development (i.e. is WAN network based and supports branches),
it will be fine. So, I'm rejecting SCCS and RCS but not CVS.

Okay, but one of the nice things about the distributed systems is that
branches don't need to be on the "central" machine. You can do all of
your development on a local branch (which you can optionally share with
others) and then when it's time, the entire branch can be trivially merged
back to mainline, preserving the revision history.

Vikram makes a good point, though: we don't want to unnecessarily raise
the bar of LLVM development to include having to get a non-standard SCC
system... hrm.

-Chris

Okay, I agree that dealing with a buggy SCC system is a bad idea. :) Does
subversion support distributed development?

I think subversion will be an excellent choice when it's ready. It
supports distributed development very well. It is CVSish (existing CVS
users will find it familiar). It handles branching and check-in much
more cleanly. For example, a check-in gets a revision number, not the
individual files. So, the entire set of what you checked in is retained
as a logical unit of work. To support authentication and access control
it has a plug-in to Apache HTTP Server so all the nice things about a
web server are available as well.
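
To make the point about revision numbers concrete, here is a toy model of the difference. It is only a sketch: the class names and file names are invented, and neither class reflects how CVS or Subversion actually store their data. It just contrasts a counter per file with a single counter per check-in.

```python
# Toy model only: not how CVS or Subversion are implemented.

class PerFileHistory:
    """CVS-like: every file carries its own revision counter."""

    def __init__(self):
        self.revisions = {}  # path -> current revision number

    def commit(self, paths):
        for path in paths:
            self.revisions[path] = self.revisions.get(path, 0) + 1
        # Nothing records that these paths were changed together.
        return {path: self.revisions[path] for path in paths}


class ChangesetHistory:
    """Subversion-like: one revision number per check-in."""

    def __init__(self):
        self.changesets = []  # list index + 1 == revision number

    def commit(self, paths):
        self.changesets.append(sorted(paths))
        return len(self.changesets)

    def files_in(self, revision):
        return self.changesets[revision - 1]


per_file = PerFileHistory()
per_file.commit(["Makefile"])
print(per_file.commit(["Makefile", "bench.cpp"]))

by_changeset = ChangesetHistory()
by_changeset.commit(["Makefile"])
rev = by_changeset.commit(["Makefile", "bench.cpp"])
print(rev, by_changeset.files_in(rev))
```

In the per-file model the second check-in leaves the two files at different revision numbers, so the grouping has to be reconstructed from log messages or timestamps. In the changeset model the revision number itself identifies the whole unit of work.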

Okay, but one of the nice things about the distributed systems is that
branches don't need to be on the "central" machine. You can do all of
your development on a local branch (which you can optionally share with
others) and then when it's time, the entire branch can be trivially merged
back to mainline, preserving the revision history.

You can do this with Subversion and to some extent with CVS (the
"sharing" a branch thing is tricky in CVS).

Vikram makes a good point, though: we don't want to unnecessarily raise
the bar of LLVM development to include having to get a non-standard SCC
system... hrm.

Yup. subversion will be easy for existing CVS users. That's one of its
major design goals.

Reid.

Chris Lattner wrote:

The more I've thought about this, the more that I'm beginning to realize
that CVS is the root of the problem. Perhaps it is time for LLVM to
seriously start looking at switching over to a decentralized version
control system? I really am not "up" on the various options, but I've
heard rumors that there are now several good options.

Take 'arch' for example: its approach seems like it would solve almost all
of the version control issues that we are facing, and supports
decentralized development in particular. From what I understand, you
would be able to do all of your development on your own "local" branch,
others could have access to it, and when it's ready, we could pull it in
as one big patch or set of changes.

There are a couple of problems. First, arch is not portable to Windows. Are
you really sure nobody will port ALVA (or parts of it) to that platform?

Second, local repository is fine, but what if two persons ever decide to work
on the same branch?

- Volodya

Reid Spencer wrote:

> Take 'arch' for example: its approach seems like it would solve almost
> all of the version control issues that we are facing, and supports
> decentralized development in particular. From what I understand, you
> would be able to do all of your development on your own "local" branch,
> others could have access to it, and when it's ready, we could pull it in
> as one big patch or set of changes.

I've looked at subversion recently. Setting it up wasn't a big deal but
it's much more complicated than CVS (e.g. you have to get a specific
version of Berkeley DB). Also, there are enough problems with it that I
deem it unstable. The last thing we need is a buggy SCC system. I think
Subversion is a good choice (functionality wise), it just isn't quite
ready yet. Perhaps by release 1.2 or so the main issues will have
settled down. I haven't looked at arch but I will.

I think the decision is up to the developers, but I'd just like to note that
your opinion of Subversion is not the only one. In particular, we've been
using it at work for something like a year (yes, a long time before 1.0 was
released), and haven't run into any significant problems.

Maybe it means that the right decision requires some experimenting.

- Volodya

> Take 'arch' for example: its approach seems like it would solve almost all
> of the version control issues that we are facing, and supports
> decentralized development in particular. From what I understand, you
> would be able to do all of your development on your own "local" branch,
> others could have access to it, and when it's ready, we could pull it in
> as one big patch or set of changes.

There are a couple of problems. First, arch is not portable to Windows. Are
you really sure nobody will port ALVA (or parts of it) to that platform?

Arch was just one example. :) I'm not familiar with Alva, what is it
(google isn't particularly helpful)?

Second, local repository is fine, but what if two persons ever decide to work
on the same branch?

I believe that arch allows you to do this kind of thing:
http://www.gnu.org/software/gnu-arch/tutorial/shared-and-public-archives.html#Shared_and_Public_Archives

... but again, I haven't really spent the time to look into revision
control systems in any detail.

-Chris

Chris Lattner wrote:

> > Take 'arch' for example: its approach seems like it would solve almost
> > all of the version control issues that we are facing, and supports
> > decentralized development in particular. From what I understand, you
> > would be able to do all of your development on your own "local" branch,
> > others could have access to it, and when it's ready, we could pull it
> > in as one big patch or set of changes.
>
> There are a couple of problems. First, arch is not portable to Windows.
> Are you really sure nobody will port ALVA (or parts of it) to that
> platform?

Arch was just one example. :) I'm not familiar with Alva, what is it
(google isn't particularly helpful)?

Actually, ALVA is what my spellchecker made from LLVM :-(. I meant: if
LLVM is ported to Windows, then arch will become a problem.

> Second, local repository is fine, but what if two persons ever decide to
> work on the same branch?

I believe that arch allows you to do this kind of thing:
http://www.gnu.org/software/gnu-arch/tutorial/shared-and-public-archives.html#Shared_and_Public_Archives

Right, but you'd need HTTP/FTP server. Not a problem for *me*, but lots of
folks are behind firewalls and can't do that.

... but again, I haven't really spent the time to look into revision
control systems in any detail.

Ok.

- Volodya

> > There are a couple of problems. First, arch is not portable to Windows.
> > Are you really sure nobody will port ALVA (or parts of it) to that
> > platform?
>
> Arch was just one example. :) I'm not familiar with Alva, what is it
> (google isn't particularly helpful)?

Actually, ALVA is what my spellchecker made from LLVM :-(.

Whoa. :)

I meant: if LLVM is ported to Windows, then arch will become a problem.

Excellent point. Windows compatibility is a must.

> > Second, local repository is fine, but what if two persons ever decide to
> > work on the same branch?
>
> I believe that arch allows you to do this kind of thing:
> http://www.gnu.org/software/gnu-arch/tutorial/shared-and-public-archives.html#Shared_and_Public_Archives

Right, but you'd need HTTP/FTP server. Not a problem for *me*, but lots of
folks are behind firewalls and can't do that.

Sure. I can't imagine that there is a wonderful solution other than this
though. In particular, how can you do distributed development without it?
The whole idea is to reduce the need for a completely centralized
development model. Ideally, the UIUC servers would just be the "official"
tree: lots of development could happen publicly in trees that are not on
our servers. When the development is done, or in a good state, it could
be merged into the official tree, and be released in the standard
releases.

I don't know if there is any ideal tool available today that satisfies
these goals, but it is my personal ideal in a source-control system.
(Assuming that you can revision directories, have atomic commits, and all
of the other things that CVS doesn't ;)

The only two systems that I'm familiar with that support this kind of
development are Arch (not available under windows?) and Bitkeeper (many
licensing restrictions). I don't think that subversion supports the
features above, but again, I really have little idea of what I'm talking
about. :)

-Chris

Chris Lattner wrote:

> Right, but you'd need HTTP/FTP server. Not a problem for *me*, but lots
> of folks are behind firewalls and can't do that.

Sure. I can't imagine that there is a wonderful solution other than this
though. In particular, how can you do distributed development without it?
The whole idea is to reduce the need for a completely centralized
development model. Ideally, the UIUC servers would just be the "official"
tree: lots of development could happen publicly in trees that are not on
our servers. When the development is done, or in a good state, it could
be merged into the official tree, and be released in the standard
releases.

Why do you really need distributed development? The possible problems with
centralized development are
1. The server might be often down.
2. There are too many active branches, so nobody understands what's
going on.
3. You can't commit while you're on a plane.

I don't think the first two points are that important, and I've never been
on a plane so can't comment on the third.

Ok, anyway, you decide.

- Volodya

Vladimir Prus <ghost@cs.msu.su> writes:

Why do you really need distributed development? The possible problems with
centralized development are
1. The server might be often down.
2. There are too many active branches, so nobody understands what's
going on.
3. You can't commit while you're on a plane.

Replace 3 with "You have no permanent internet connection, or you are
behind a firewall, so you can not access the server at all (no diffs,
no logs, nothing)".

BTW, before considering arch too seriously, you should check how
mature/stable it is. Last time I heard about it, Tom Lord was pleading
for help and funding to finish Arch.

OTOH, Subversion is just a sane CVS. No distributed repositories.

With CVS, some people keep a copy of the main repository on their
local computers. That's what some gcc developers do. They rsync from
time to time with the remote repository. I don't know how serious
the inconveniences are with this approach.

BitKeeper, due to its license, is a no-no, IMHO.

[snip]

Vladimir Prus <ghost@cs.msu.su> writes:

> Why do you really need distributed development? The possible problems with
> centralized development are
> 1. The server might be often down.
> 2. There are too many active branches, so nobody understands what's
> going on.
> 3. You can't commit while you're on a plane.

Replace 3 with "You have no permanent internet connection, or you are
behind a firewall, so you can not access the server at all (no diffs,
no logs, nothing)".

And #4: it makes permissions on the server much easier to deal with.

BTW, before considering arch too seriously, you should check how
mature/stable it is. Last time I heard about it, Tom Lord was pleading
for help and funding to finish Arch.

Yes, I think that Tom Lord is the single biggest problem with Arch. :)
OTOH, Arch is now a gnu project, so perhaps it's better now. In any case,
everything that I know about it is dated. :)

OTOH, Subversion is just a sane CVS. No distributed repositories.

With CVS, some people keep a copy of the main repository on their
local computers. That's what some gcc developers do. They rsync from
time to time with the remote repository. I don't know how serious
the inconveniences are with this approach.

Yeah, that's a solution, but it's such a hack! :)

BitKeeper, due to its license, is a no-no, IMHO.

Agreed.

-Chris