Is there room for another build system?

David Greene <dag@cray.com> writes:

For my test suite I use Tcl (with TclX, no Expect). It watches stdout
and stderr, gets exit codes and has a timer for killing hung
processes. Process control works the same on Windows and Unix and takes
less than 30 lines of code.
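
For comparison, the same kind of process control (capture stdout and stderr, collect the exit code, kill a process that hangs past a timeout) can be sketched in Python. This is only an illustration built on the standard `subprocess` module (Python 3.7 or later), not the Tcl/TclX code described above; the helper name `run_with_timeout` and the layout of the result are made up for the example.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd, capture its output, and kill it if it runs longer than timeout_s.

    Hypothetical helper: returns a dict with the exit code, the captured
    stdout/stderr text, and a flag saying whether the timeout fired.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,          # decode output as text
            timeout=timeout_s,  # subprocess.run kills the child on expiry
        )
    except subprocess.TimeoutExpired:
        return {"timed_out": True, "exit_code": None, "stdout": "", "stderr": ""}
    return {
        "timed_out": False,
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


# Example: run the current Python interpreter as the "test program".
result = run_with_timeout([sys.executable, "-c", "print('hello')"], timeout_s=30)
print(result["exit_code"], result["stdout"].strip())
```

The same call works on Windows and Unix because `subprocess` hides the platform differences; whether that matches the TclX behaviour in every detail is not something this sketch claims.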

What else do you need?

A way to examine asm output and compare to expected patterns.

This is a text manipulation task, isn't it? No problem.

A way to run performance regression tests (i.e. looking at CPU time
and allowing for fuzz between test runs).

Do you mean that you want a language with that feature built-in?

In my experience Tcl is very hard to work with due to the
non-existence of any reasonable debugging tools on all platforms.

For one reason or another, my Tcl code never grows so large that I miss
a debugger. A `puts' here and there plus the REPL is enough :-)
Message: 6
Date: Mon, 4 Aug 2008 17:48:47 -0500
From: David Greene <dag@cray.com>
Subject: Re: [LLVMdev] Is there room for another build system?
To: llvmdev@cs.uiuc.edu
Message-ID: <200808041748.48057.dag@cray.com>
Content-Type: text/plain; charset="iso-8859-1"

David Greene <dag@cray.com> writes:

For my test suite I use Tcl (with TclX, no Expect). It watches stdout
and stderr, gets exit codes and has a timer for killing hung
processes. Process control works the same on Windows and Unix and takes
less than 30 lines of code.

What else do you need?

A way to examine asm output and compare to expected patterns.

This is a text manipulation task, isn't it? No problem.

No problem with Perl either, or Python. Tcl is much less well-known.
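
As a rough idea of what "compare asm output to expected patterns" might look like in Python, here is a small sketch using the standard `re` module. The function name `check_patterns`, the sample assembly, and the patterns are all invented for illustration; this is not how any existing LLVM test is written.

```python
import re


def check_patterns(asm_text, expected_patterns):
    """Return the expected patterns that do NOT occur anywhere in asm_text."""
    return [
        pattern
        for pattern in expected_patterns
        if not re.search(pattern, asm_text, re.MULTILINE)
    ]


# Made-up compiler output for a trivial add function.
asm = """\
        .text
        .globl  add
add:
        movl    %edi, %eax
        addl    %esi, %eax
        ret
"""

missing = check_patterns(
    asm,
    [
        r"^add:",                  # the label must start a line
        r"addl\s+%esi,\s*%eax",    # the add we expect to see
        r"\bimull\b",              # deliberately absent from the sample
    ],
)
print("missing patterns:", missing)
```

A test would pass when the returned list is empty and report the missing patterns otherwise.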

Note that I don't particularly like any of these languages but I'm trying
not to let personal preference get in the way. :-)

(Is there a scripting language that you like?)

A way to run performance regression tests (i.e. looking at CPU time
and allowing for fuzz between test runs).

Do you mean that you want a language with that feature built-in?

No, I mean in the future we should have tests that actually pass/fail based
on their runtime performance. To do that you need a way to time the test
and a way to account for normal system variations (the fuzz bit).

We don't have any of these kinds of tests yet. But I hope we do in the
future.
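
To make the "time the test and allow for fuzz" idea concrete, here is a minimal Python sketch. It assumes a stored baseline time and a fractional tolerance; the names `measure_cpu_seconds` and `within_fuzz`, the 10% default, and the use of a median over several runs are assumptions for the example, not an existing LLVM facility.

```python
import statistics
import time


def measure_cpu_seconds(func, repeats=5):
    """Run func several times and return the median CPU time in seconds.

    Taking the median of a few runs is one simple way to damp
    run-to-run noise; other choices (minimum, mean) are possible.
    """
    samples = []
    for _ in range(repeats):
        start = time.process_time()  # CPU time of this process, not wall clock
        func()
        samples.append(time.process_time() - start)
    return statistics.median(samples)


def within_fuzz(measured, baseline, fuzz=0.10):
    """Pass if measured is at most `fuzz` (a fraction) slower than baseline."""
    return measured <= baseline * (1.0 + fuzz)


# Example with a stand-in workload and a hypothetical baseline of one second.
elapsed = measure_cpu_seconds(lambda: sum(range(100000)))
print("pass" if within_fuzz(elapsed, baseline=1.0) else "fail")
```

Note that `time.process_time` only measures the current process; timing a separately spawned compiler or test binary would need a different measurement.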

In my experience Tcl is very hard to work with due to the
non-existence of any reasonable debugging tools on all platforms.

For one reason or another, my Tcl code never grows so large that I miss
a debugger. A `puts' here and there plus the REPL is enough :-)

How many people know Tcl? That has a direct impact on maintainability.

IMHO, anyone that is discussing LLVM internals will have no trouble with any popular scripting language, including Tcl.

I'm not expert in any of these languages, nor do I have a strong preference. But I have some impressions:

C/C++: I guess everyone here already understands the pros and cons of these languages. :-)

sh/csh/bash/zsh: Widely understood and used, but wordy and there are portability issues. (Perhaps someone with MinGW and Cygwin experience can speak to shell portability?) It's my understanding that autoconf writes shell scripts that do configuration for many open-source packages; IOW, package owners avoid writing shell scripts, instead treating shell as a low-level interpreter for their autoconf programs.

Python is elegant and scales well to large codebases IMHO. Python seems to be well-supported and portable, with a large and growing library and an active development community. Whitespace is semantically important in Python; many people find this "feature" intolerable. I understand the 3.x revision of Python will break some older Python code, and this is deliberate; another poster here says that every previous Python minor release has broken his existing code. (I can't predict whether any Python scripts for LLVM would suffer similarly.) Python is fully OO with structures and classes, and there are standardized classes for dictionaries (index an array with a non-integer) and regular expressions. A Python program can be "extended" with C code, in that an existing Python class or module can be re-cast in C, with a resulting reduction in friendliness and portability, and a commensurate improvement in speed. http://www.python.org

Perl generally frightens me (there's a reason it's called "the Swiss-army chainsaw"), but it seems to be relatively portable (runs on Windows), and there's a large and active developer community continually enlarging a huge pile of packages to extend it in myriad inconceivable directions. Perl the language is extremely powerful and often terse to the point of inscrutability; specifically, every ASCII punctuation character seems to be an operator, and almost any punctuation juxtaposed with an "=" is another. Variables in Perl always begin with "$" (scalar), "@" (array), or "%" (hash), and "$_", "$|", and "@_" are important variables that every nontrivial Perl script will use. The semantics of Perl argument passing are ... unfortunate, although they seem to work well enough in practice. Perl claims to support extensions written in C, but I am wholly unfamiliar with this feature. Perl has the most popular datatypes (string, array, dictionary, scalar) included in the base syntax, and it supports structures and classes, albeit with a strange syntax ("$self->{field3}"). Longstanding Perl convention seems to avoid the OO features added in Perl 5, preferring to store structures as lists of lists (à la LISP). http://www.perl.org

Tcl ("Tool Command Language") was developed to be a simple language, and easily extended using C. Tcl is old, well-supported and portable (runs on Windows). I find the base language to be simplistic and inelegant, and I never figured out the quoting/expansion rules, but the extension feature is very powerful. The justly-famous Tk library made it possible to write GUI apps in a scripting language, and IMHO Tk is a significant part of Tcl's success (of course, it's unlikely that Tk would be useful to LLVM). The base Tcl language includes dictionaries and regular expressions, but only one datatype: string, and one-dimensional arrays thereof. (Multi-dimensional arrays are simulated with dictionaries: 'array(1,3)' is interpreted as 'array("1,3")'.) http://www.tcl.tk
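
The "multi-dimensional array as a dictionary keyed by a string" trick is easy to show outside Tcl. The following Python sketch mimics it with an ordinary dict; it illustrates the idea only and is not Tcl code, and the helper name `key` and the sample values are invented.

```python
# A two-dimensional "array" stored in a flat dictionary, in the spirit of
# Tcl's array(1,3), where the element name is really the string "1,3".
grid = {}


def key(row, col):
    """Build the string key that stands in for a (row, col) index pair."""
    return "{},{}".format(row, col)


grid[key(1, 3)] = "x"
grid[key(2, 0)] = "y"

# The element is reachable through the helper or through the literal string.
print(grid[key(1, 3)], grid["1,3"])
```

One consequence of this scheme is that "1,3" and "1, 3" are different keys, since the index is just text.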

Expect is an extension of Tcl; literally, Tcl extended with some new verbs and constructs. Expect creates and manages pseudo-ttys so that it can pretend to be human and manage interactive situations. Expect/Tcl is the language of GCC's DejaGNU testsuite, probably because it easily automates testing on development boards (talk to ROM, compile & download test, get results, etc). The utility of Expect shouldn't be underestimated. Perl, Python and Ruby all have cloned the Expect functionality with one or more extensions/packages/modules/voodoo, but it's not clear to me that any of these extensions are widely used or well-supported. http://expect.nist.gov

Excepting C/C++, every language above omits variable declarations; variables spring into existence at first mention. Every language above claims to have some sort of debugger, although I can't speak to the quality of these. Perl has a standardized debugger; the other languages all seem to have a multitude of debuggers of varying quality and/or support.

I believe that OS X and every modern Linux distribution all support all of these languages, either by default or with the installation of a package. I hope others with Windows experience can inform us how well these work in the Windows environment.

I can't speak to Ruby or Lua; perhaps someone else on this list could inform us.

Are there any other languages to consider? (AWK is too limited IMHO, and superseded by Perl anyway.) REXX?

stuart

sh/csh/bash/zsh: Widely understood and used, but wordy and there are portability issues. (Perhaps someone with MinGW and Cygwin experience can speak to shell portability?)

IMHO, adding even more reliance on mingw/cygwin to make things work on Windows is bad.

-sr

Stuart Hastings wrote:

David Greene <dag@cray.com> writes:

For my test suite I use Tcl (with TclX, no Expect). It watches stdout
and stderr, gets exit codes and has a timer for killing hung
processes. Process control works the same on Windows and Unix and takes
less than 30 lines of code.

What else do you need?
        

A way to examine asm output and compare to expected patterns.
      

This is a text manipulation task, isn't it? No problem.

A way to run performance regression tests (i.e. looking at CPU time
and allowing for fuzz between test runs).
      

Do you mean that you want a language with that feature built-in?

In my experience Tcl is very hard to work with due to the
non-existence of any reasonable debugging tools on all platforms.
      

For one reason or another, my Tcl code never grows so large that I miss
a debugger. A `puts' here and there plus the REPL is enough :-)

David Greene <dag@cray.com> writes:
      

For my test suite I use Tcl (with TclX, no Expect). It watches stdout
and stderr, gets exit codes and has a timer for killing hung
processes. Process control works the same on Windows and Unix and takes
less than 30 lines of code.

What else do you need?
          

A way to examine asm output and compare to expected patterns.
        

This is a text manipulation task, isn't it? No problem.
      

No problem with Perl either, or Python. Tcl is much less well-known.

Note that I don't particularly like any of these languages but I'm trying
not to let personal preference get in the way. :-)
    
(Is there a scripting language that you like?)

A way to run performance regression tests (i.e. looking at CPU time
and allowing for fuzz between test runs).
        

Do you mean that you want a language with that feature built-in?
      

No, I mean in the future we should have tests that actually pass/fail based
on their runtime performance. To do that you need a way to time the test
and a way to account for normal system variations (the fuzz bit).

We don't have any of these kinds of tests yet. But I hope we do in the
future.

In my experience Tcl is very hard to work with due to the
non-existence of any reasonable debugging tools on all platforms.
        

For one reason or another, my Tcl code never grows so large that I miss
a debugger. A `puts' here and there plus the REPL is enough :-)
      

How many people know Tcl? That has a direct impact on maintainability.
    
IMHO, anyone that is discussing LLVM internals will have no trouble with any popular scripting language, including Tcl.
  

Other than learning curve, yes.

I'm not expert in any of these languages, nor do I have a strong preference. But I have some impressions:

...

sh/csh/bash/zsh: Widely understood and used, but wordy and there are portability issues. (Perhaps someone with MinGW and Cygwin experience can speak to shell portability?)

I'm indulging in this exercise to enable testing a native MingW32 build of LLVM in Windows.

There are more portability issues *between* shells, than across OS's. If I go ahead with targeting bash, I suspect (by avoiding bash extensions and otherwise being careful) that the resulting script should work on any recent Bourne compatible shell. csh will not be supported at all (incompatible test operator syntax).

Note that shell scripts can coordinate invoking other languages/tools; targeting bash doesn't rule out using Tcl/Perl/Python/etc. where convenient.

With best regards,
Kenneth

Kenneth Boyd <zaimoni@zaimoni.com> writes:

I'm indulging in this exercise to enable testing a native MingW32 build
of LLVM in Windows.

If LLVM's DejaGNU usage is the same as GCC's, I'll google or ask on the
MinGW mailing list how MinGW testers run the GCC testsuite, before
trying to fix something that maybe is not broken.

There are more portability issues *between* shells, than across OS's.
If I go ahead with targeting bash, I suspect (by avoiding bash
extensions and otherwise being careful) that the resulting script should
work on any recent Bourne compatible shell. csh will not be supported
at all (incompatible test operator syntax).

Note that shell scripts can coordinate invoking other languages/tools;
targeting bash doesn't rule out using Tcl/Perl/Python/etc. where convenient.

AFAIK, there is no "native" port of `bash' on Windows. If you plan to
use Cygwin's (or MSYS', which is a fork of Cygwin) you will discover
that it is quite tricky to work with non-Cygwin processes (including
MinGW's gcc) due to differences in directory structures, I/O, process
control, etc.
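
One concrete instance of the directory-structure difference: MSYS exposes Windows drives as POSIX-style paths such as /c/llvm/bin, while a native (non-MSYS) process expects C:\llvm\bin. A test driver that hands paths to native tools may need to translate them itself. The Python sketch below shows the idea for the drive-letter case only; the function name `msys_to_windows` is hypothetical, and it deliberately leaves paths like /usr/bin alone because their real location depends on where MSYS is installed.

```python
import re


def msys_to_windows(path):
    """Translate an MSYS-style drive path (/c/foo) to Windows form (C:\\foo).

    Only the single-letter drive prefix is handled. Anything else is
    returned unchanged, since mapping it needs the MSYS install root.
    """
    match = re.match(r"^/([A-Za-z])(/.*)?$", path)
    if not match:
        return path
    drive = match.group(1).upper()
    rest = (match.group(2) or "/").replace("/", "\\")
    return drive + ":" + rest


print(msys_to_windows("/c/llvm/bin"))
print(msys_to_windows("/usr/bin"))
```

Cygwin uses a different prefix (/cygdrive/c by default), so a driver meant for both environments would need another rule for that case.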

Óscar Fuentes wrote:

Kenneth Boyd <zaimoni@zaimoni.com> writes:

I'm indulging in this exercise to enable testing a native MingW32 build of LLVM in Windows.
    
If LLVM's DejaGNU usage is the same as GCC's, I'll google or ask on the
MinGW mailing list how MinGW testers run the GCC testsuite, before
trying to fix something that maybe is not broken.
  

Note that the official MinGW GCC binaries generally are not bootstrapped; they're cross-compiled (presumably from CygWin). [In particular, both the 3.4.5 and 4.2.1 MinGW binaries of gcc are not bootstrapped.] I use MinGW rather than CygWin for political reasons, so running DejaGNU under CygWin isn't a real option for me.

There are more portability issues *between* shells, than across OS's. If I go ahead with targeting bash, I suspect (by avoiding bash extensions and otherwise being careful) that the resulting script should work on any recent Bourne compatible shell. csh will not be supported at all (incompatible test operator syntax).

Note that shell scripts can coordinate invoking other languages/tools; targeting bash doesn't rule out using Tcl/Perl/Python/etc. where convenient.
    
AFAIK, there is no "native" port of `bash' on Windows. If you plan to
use Cygwin's (or MSYS', which is a fork of Cygwin) you will discover
that it is quite tricky to work with non-Cygwin processes (including
MinGW's gcc) due to differences in directory structures, I/O, process
control, etc.
  

I've been using MSYS' for slightly over a decade now. It's not nearly as tricky as you imagine, aside from not having *NIX fork() and the occasional adjustments needed to deal with confused configure scripts. Just remember to run configure and make from *within* bash (MSYS-3.1); things are much worse with sh ./configure from the Windows command shell, than ./configure within bash.

Best regards,
Kenneth

The reason CMake won't build "out of the box" for me, is that it's *misapplying* the CygWin workarounds to MinGW -- and then refusing to write out something for me to fix up.

So just to add to this conversation, I *too* was thinking about
rewriting the build system :-) I've been writing a new build system
from the ground up, and I was hoping to see how my system would work
against llvm. I wanted to make sure that I understand the licensing
issues. I'm releasing my build system using just a standard BSD
license, which is compatible with llvm's license, right? I just have
to make sure any code I use from llvm's autoconf directory is properly
attributed?

Anyway, if anyone is interested in yet-another-buildsystem, I've got
the (pre-alpha) code up here:

http://git.felix-lang.org/?p=fbuild.git;a=summary

It's designed to be used for cross compiling, configuration, and being
really simple to extend, since it's written in Python (3.0b2). It's
also only ~3500 lines, which is way smaller than any other build
system I've seen for what it supports. I'll be on #llvm and #felix on
freenode (as erickt) if anyone actually wants to talk about it.

Yes.

-Chris

[As this is turning off-topic, feel free to switch to private email]

Kenneth Boyd <zaimoni@zaimoni.com> writes:

Note that the official MinGW GCC binaries generally are not
bootstrapped; they're cross-compiled (presumably from CygWin). [In
particular, both the 3.4.5 and 4.2.1 MinGW binaries of gcc are not
bootstrapped.] I use MinGW rather than CygWin for political reasons, so
running DejaGNU under CygWin isn't a real option for me.

If politics enter the game, this becomes much too complex for me, sorry.

I've been using MSYS' for slightly over a decade now.

Is MSYS so old? IIRC its existence was first made public in 2001. At the
time I was an active MinGW-ist. (I'm getting old... :-)

It's not nearly as tricky as you imagine, aside from not having *NIX
fork() and the occasional adjustments needed to deal with confused
configure scripts.

As mentioned above, MSYS is a Cygwin fork, tweaked for MinGW
friendliness. However, as you point out, MinGW developers prefer to
build from Cygwin. Maybe this comes from the previous MinGW developers
(pre-MSYS) who used Cygwin extensively, maybe there are technical
reasons for doing so.

[snip]

The reason CMake won't build "out of the box" for me, is that it's
*misapplying* the CygWin workarounds to MinGW -- and then refusing to
write out something for me to fix up.

I'm having success building the LLVM libraries with CMake 2.6.1 on
MSYS, where 2.6 failed, IIRC. Maybe this was fixed.

Óscar Fuentes wrote:

[As this is turning off-topic, feel free to switch to private email]
  

Agreed; let's see if my local ISP's DNS is working now.

Kenneth Boyd <zaimoni@zaimoni.com> writes:

I've been using MSYS' for slightly over a decade now.
    
Is MSYS so old? IIRC its existence was first made public in 2001. At the
time I was an active MinGW-ist. (I'm getting old... :-)
  

That far back, my memory is fuzzy. One of my internal C++ projects was forced into it no later than early 2000, as it wouldn't compile on any other C++ compiler.

[snip]

The reason CMake won't build "out of the box" for me, is that it's *misapplying* the CygWin workarounds to MinGW -- and then refusing to write out something for me to fix up.
    
I'm having success building the LLVM libraries with CMake 2.6.1 on
MSYS, where 2.6 failed, IIRC. Maybe this was fixed.
  

I will double-check, then. (2.6.0 was the most recent version available when I downloaded it to test before commenting.)

Best regards,
Kenneth