Is it possible to select UnknownArch through command line in clang?

Hi all,

Quick background story: I would like to select
DefaultTargetCodeGenInfo when generating LLVM IR for a project of
mine. And as far as I could figure out, that could be accomplished by
selecting UnknownArch as the target architecture.

What I could not figure out was how to set UnknownArch from the
command line using --target or something like that.

Invoking clang like this:
clang --target=x86_64-unknown-linux-gnu simple.c
is fine.

But for this:
clang --target=unknown-unknown-linux-gnu simple.c

I get an error:
error: unknown target triple 'unknown-unknown-linux-gnu', please use -triple or -arch

It would really be awesome if UnknownArch could somehow be selected
from the command line. Is that possible? Am I missing something?

Thanks!
-- Zvonimir

This might be me answering the question you're trying to ask rather than the one you're specifically asking, but does '-S -emit-llvm' do what you want?

Jon

Not really...

My question is a follow-up on this:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/072986.html

I figured out how to turn off coercion by hacking into clang and
adding one line that selects DefaultTargetCodeGenInfo (which does not
do coercion) by default. Now, this is a hacky solution that requires
distributing this weird patch for clang with my tool (SMACK verifier).

So I wonder if the same thing could somehow be achieved through the
command line. I tried something like:
clang --target=unknown-unknown-linux-gnu simple.c
but that throws an error.

Thanks!

Looking at Driver.cpp, passing unknown as Arch is treated as an error. Not sure if hacking this check would solve your problem or explode later.
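
To make the failure mode concrete, here is a toy model of the dispatch. It is not clang's actual code: ToyArch, parseToyArch and allocateToyTarget are hypothetical names invented for this sketch, and only two architectures are modelled. The assumption it illustrates is the one described above: an unrecognised architecture component yields no target object, and the caller turns that into the "unknown target triple" error.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an architecture enum; NOT llvm::Triple::ArchType.
enum class ToyArch { Unknown, X86_64, XCore };

// Map the first dash-separated component of a triple to a ToyArch.
ToyArch parseToyArch(const std::string &triple) {
  assert(!triple.empty() && "triple must not be empty");
  const std::string arch = triple.substr(0, triple.find('-'));
  if (arch == "x86_64") return ToyArch::X86_64;
  if (arch == "xcore") return ToyArch::XCore;
  return ToyArch::Unknown;  // also covers the literal string "unknown"
}

// Toy target factory: returns a description, or nullptr when no target
// exists for the architecture. A caller would report an error on nullptr.
const char *allocateToyTarget(ToyArch arch) {
  switch (arch) {
  case ToyArch::X86_64:  return "x86_64 target";
  case ToyArch::XCore:   return "xcore target";
  case ToyArch::Unknown: return nullptr;
  }
  return nullptr;
}
```

In this model, allocateToyTarget(parseToyArch("unknown-unknown-linux-gnu")) returns nullptr, while the x86_64 triple yields a target.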

Ok, I invested some time looking into this issue...

Here is the update I propose in clang/lib/Basic/Targets.cpp:
static TargetInfo *AllocateTarget(const llvm::Triple &Triple) {
  llvm::Triple::OSType os = Triple.getOS();

  switch (Triple.getArch()) {
  default:
    return NULL;

  // Add this case to support UnknownArch
  case llvm::Triple::UnknownArch:
    return new X86_64TargetInfo(Triple);

  case llvm::Triple::xcore:
    return new XCoreTargetInfo(Triple);

Please comment... This works fine on my regressions, and I suspect it
should work fine on clang regressions too.

Now, is there any chance this update could be pushed into the main
clang trunk so that unknown architecture is supported?

This is really important to us, and releasing a patch for clang with
SMACK verifier would be a very inconvenient solution.

Thanks a lot!

Best,
-- Zvonimir

I don’t see how Clang can support targeting an unknown ISA. How is it supposed to lower sizeof(void*) without real data layout information? Or do any record layout at all? The calling convention stuff is really just part of it. Is there any reason you can’t use the le32 target, or whatever PNaCl uses? Why is byval struct expansion a problem for SMACK?
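
As a small illustration of the data layout point, the sketch below shows values that the compiler has to fix at compile time from the target description. The Record struct and the helper name are made up for this example. On a typical x86_64 target the three quantities are usually 8, 8 and 16; on a typical 32-bit target they are usually 4, 4 and 8. With no known architecture there is nothing to pick them from.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record whose layout depends on the target's pointer
// size and alignment rules.
struct Record {
  char tag;
  void *ptr;
};

// All three values are compile-time constants derived from the target.
constexpr std::size_t kPointerSize = sizeof(void *);
constexpr std::size_t kPtrOffset = offsetof(Record, ptr);
constexpr std::size_t kRecordSize = sizeof(Record);

// Properties that hold on any target, whatever the concrete numbers are.
void checkRecordLayout() {
  assert(kPtrOffset % alignof(void *) == 0);         // ptr is suitably aligned
  assert(kRecordSize >= kPtrOffset + kPointerSize);  // ptr fits in the record
}
```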

Ok, yes, we should probably take a step back...

What SMACK currently does not like is type coercion when for example
{i32,i32} is packed into i64 before a function call, and so on. And
the reason is that this often requires byte-level reasoning that is
not supported in SMACK, and on top of that supporting it would almost
certainly add a performance overhead.
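
For readers who have not seen this coercion, here is a hand-written imitation of it in source form. It is only a sketch: Pair, packPair and sumCoerced are names invented for the example, and the real transformation happens inside clang's IR generation, not in user code. The first function is what the programmer writes; the other two mimic what the callee and caller look like once the two 32-bit fields travel as a single 64-bit integer, which is why a verifier then has to reason about bytes rather than fields.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct Pair {
  std::int32_t a;
  std::int32_t b;
};

// Source-level view: the struct is passed by value, fields stay visible.
std::int32_t sum(Pair p) { return p.a + p.b; }

// Imitation of the caller side after coercion: the 8 bytes of the
// struct are copied into one 64-bit integer.
std::uint64_t packPair(Pair p) {
  static_assert(sizeof(Pair) == sizeof(std::uint64_t), "Pair must be 8 bytes");
  std::uint64_t bits;
  std::memcpy(&bits, &p, sizeof bits);
  return bits;
}

// Imitation of the callee side: the fields are recovered from the raw
// bytes of the integer before they can be used.
std::int32_t sumCoerced(std::uint64_t bits) {
  Pair p;
  std::memcpy(&p, &bits, sizeof p);
  return p.a + p.b;
}
```

Both sum(p) and sumCoerced(packPair(p)) compute the same result; the difference is only in how much byte-level reasoning is needed to see that.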

So what I learned after poking around clang source is that enabling
DefaultTargetCodeGenInfo solves this problem since it disables most of
such coercions. After learning that, I ventured into figuring out how
to select DefaultTargetCodeGenInfo from command line, and it appears
that is not possible. Is that right?

Now, I got to UnknownArch by tracking the source code back from where
DefaultTargetCodeGenInfo gets selected. If I remember correctly, clang
will select DefaultTargetCodeGenInfo if UnknownArch is selected in the
Triple. So then I started asking how to select UnknownArch...

One final question: why does DefaultTargetCodeGenInfo even exist if it
cannot be selected?

Thanks!

> Ok, yes, we should probably take a step back...
>
> What SMACK currently does not like is type coercion when for example
> {i32,i32} is packed into i64 before a function call, and so on. And
> the reason is that this often requires byte-level reasoning that is
> not supported in SMACK, and on top of that supporting it would almost
> certainly add a performance overhead.

Sometimes this is desirable because it more closely models in LLVM IR what
is actually happening on x86_64.

> So what I learned after poking around clang source is that enabling
> DefaultTargetCodeGenInfo solves this problem since it disables most of
> such coercions. After learning that, I ventured into figuring out how
> to select DefaultTargetCodeGenInfo from command line, and it appears
> that is not possible. Is that right?
>
> Now, I got to UnknownArch by tracking the source code back from where
> DefaultTargetCodeGenInfo gets selected. If I remember correctly, clang
> will select DefaultTargetCodeGenInfo if UnknownArch is selected in the
> Triple. So then I started asking how to select UnknownArch...

This all makes sense. :)

> One final question: why does DefaultTargetCodeGenInfo even exist if it
> cannot be selected?

DefaultTargetCodeGenInfo is essentially an abstract base class for other
ABIs.

I think you should do something like what PNaCl does and add a new virtual
ISA like "smack64" or "le64" that has a known data layout, endianness, etc,
and always uses 'byval' when passing structs by value.
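
A rough sketch of the difference such a virtual ISA would make is below. This is a toy model, not clang's ABI code: ToyAbi, PassKind and classifyStructArg are hypothetical names, and the 16-byte threshold is only an approximation of how small integer-only structs are handled on x86_64 (the real classification has more cases).

```cpp
#include <cassert>
#include <cstddef>

// How a struct argument reaches the callee in this toy model.
enum class PassKind { CoerceToInteger, ByVal };

// Two hypothetical ABI flavours: one that behaves roughly like x86_64
// for integer-only structs, and one that always passes structs 'byval'.
enum class ToyAbi { X86_64Like, AlwaysByVal };

PassKind classifyStructArg(ToyAbi abi, std::size_t sizeInBytes) {
  assert(sizeInBytes > 0 && "empty structs are not modelled");
  if (abi == ToyAbi::AlwaysByVal)
    return PassKind::ByVal;  // no coercion, fields stay visible in the IR
  // Approximation: small structs are coerced to integers, larger ones
  // are passed in memory.
  return sizeInBytes <= 16 ? PassKind::CoerceToInteger : PassKind::ByVal;
}
```

Under this model an 8-byte {i32,i32} struct is coerced with the x86_64-like flavour and stays 'byval' with the always-byval flavour, which is the behaviour SMACK wants.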

This is a great suggestion, thanks!
Is there any chance that something like smack64 would be integrated
into the main trunk of clang?

The main reason I started this discussion was to find a solution that
would not require applying a patch to clang whenever somebody wants to
use it with SMACK.

Thanks!
-- Zvonimir

Hi guys,

So I created two tiny patches for LLVM and clang that allow target
triples of the form x86_64-unknown-smack to be used, which in turn
disables various type coercions.

I tested the patches on both the LLVM and clang regression suites, and
they seem to be working fine.

Please let me know how to proceed with this.

Thanks!

Best,
-- Zvonimir

clang-smack.patch (1.16 KB)

llvm-smack.patch (1.07 KB)

Your changes seem reasonable to me, but I have no say. I suggest you send them to cfe-commits for the official review process.

Nikola