A friendly question

I'm not sure there will be enough use for it, but I was wondering whether there would be enough interest in making an LLVM-based clone of the gnucobol project (possibly launching a Summer of Code project for it).

For now, gnucobol works by processing COBOL source code and producing C source.

That way we could try to beat both gnucobol and the COBOL compiler that IBM recently released for x86 Linux (if I'm not mistaken about the OS).

Best regards,
Pawel Kunio


I've renamed the thread so that interested folks might see it.

I think there are a bunch of somewhat conflated questions here and it's probably worth unpicking them a bit. They seem to be:

  - Is anyone interested in writing a COBOL front end for LLVM?
  - Would a COBOL front end be considered for integration with the LLVM repository?
  - Would projects to work on an LLVM COBOL front end be considered for the GSoC or similar?

I can take a stab at answering these:

  - There are almost certainly some people interested in COBOL. Most folks on this list probably don't care about Fortran, but that hasn't prevented flang from being written and integrated into the repository. Some of the IBM people might be interested. I think the main question is what the differentiation is from gnucobol. From a quick skim, gnucobol translates COBOL to C, so you can presumably already use it with clang and get an LLVM back end. Writing a new front-end is a nontrivial task and so you'd have to explain what the benefit is of a new one. Most of the codegen-related benefits don't apply (presumably you can already use clang + gnucobol to compile COBOL to IR and do LTO with it and C/C++/Fortran/Whatever code). Would you want to reuse Clang's diagnostic infrastructure and provide better error messages? Is it just a question of the license?

  - There's a difference of opinion in the community about the importance of being in-tree. I'd personally prefer that clang and flang were both moved out of the monorepo to force us to think more about library interfaces. The more LLVM developers are also working on out-of-tree projects, the better our libraries become. That said, if you want a project to be considered for eventual inclusion in the tree, then making sure that the license is the same as the rest of the project and that the coding style is the same is a good first step. You can then defer this decision until the front end is more mature and you see some clear benefit in being upstream.

  - For a project to be considered for the GSoC, the only real requirement is finding someone willing to mentor it. That means that you need to find someone who is an existing LLVM contributor who is interested in COBOL and you need to provide a good answer to the 'why isn't gnucobol good enough?' question.


Actually my first reaction was, retarget gnucobol to emit LLVM IR instead
of C, and run the LLVM backend on it. (I'd speculated to myself about
doing this as a fun post-retirement project.) There are probably licensing
questions to resolve there first. I could see compilation speed getting
better as a result, although runtime performance is unlikely to be better
than routing through C. Not clear what the effect would be on debugging;
it looks like there are a couple of packages out there that let you debug
COBOL using the existing gnucobol setup, and gdb doesn't understand COBOL
so if anything the debugging situation is likely to get worse.

I've retargeted a COBOL front-end in the past (not to LLVM), and I'd
estimate that it is significantly more work than one GSoC project.
(Not that I've actually looked inside gnucobol to see how it works.)

As for who's interested in a COBOL front-end:
The OpenVMS people (vmssoftware.com) are definitely headed in the direction
of a native LLVM-speaking COBOL frontend, although I'm sure it would remain
proprietary. That direction would presumably include adding DIBuilder
features for the COBOL data types, and getting LLVM to emit the proper DWARF
descriptions. Haven't seen any signs of that happening upstream, though.


Yes, we have our Digital legacy COBOL frontend hooked to LLVM. That
frontend generates our legacy GEM IR which is then converted to LLVM IR.
It is currently an Itanium-hosted cross-compiler but we're bootstrapping
our compilers to native OpenVMS x86 right now (we have clang "working"
on OpenVMS x86 on VirtualBox today).

The frontend (and much of the companion library to process the DEC4/DEC8
datatypes) still has Digital copyrights which are now owned by HPE and
licensed to us. I would be unable to opensource it without their
permission. And you'd get a nice vintage COBOL 85 compiler written in
BLISS. :slight_smile: :slight_smile: :slight_smile:

As for the DIBuilder COBOL support, since our cross-compilers are based
on an ancient LLVM 3.4.2 (due to the ancient Itanium C++ we have on our
host systems), we have to refresh all of that with our native
bootstrapping before I could even consider upstreaming any of that. And
we are just starting on our symbolic debugger so I don't know if
anything we've done even works yet. And I haven't even explained level
88 condition names to the debugger engineers yet. :slight_smile:

For those keeping score at home, what we have so far is our legacy
compilers for BASIC, BLISS, C, COBOL, Fortran95, Macro-32 VAX assembly,
and Pascal. All but BASIC are in good shape. BASIC and its RTL do some
unnatural acts. And now we just bootstrapped clang 10 (we had to pick
something to start) by compiling on Linux using a mixture of OpenVMS and
Linux headers and then moving the objects to OpenVMS for linking (using
the OpenVMS linker of course).


This “conflict” should now be somewhat reduced with the introduction of incubation projects, which are more official than a “user” project but less so than being in the monorepo.

As a first step, a GSoC project could create a side-project seed (on a private repo) with some support for the language, using close to trunk LLVM.

Later, if the community picks it up, there could be enough momentum for an incubation project in the LLVM GitHub repo, where more people can collaborate and which is under our management.

Only if the project really takes off, with a large subset of our community caring about it and frequently testing large amounts of code against trunk, should it be considered for the monorepo.

Not small steps, but smaller than in the SVN times… :smiley:


I'm interested in work on Pascal.

Best regards,
Pawel Kunio

On Thu, 6 May 2021 at 17:54, John Reagan <john.reagan@vmssoftware.com> wrote: