clang -fms-extensions mode, tasks to do.

For people interested in improving the -fms-extensions mode of clang,
here is an incomplete list of tasks to do.
These are some of the extensions that must be added to clang in order
to parse the Windows SDK header files bundled with VS 2008.
I plan to address every one of them in roughly this order, but if you
want to help, that's more than welcome because
the amount of time I can put into clang is (unfortunately) limited.

BTW these are just parsing issues (-fsyntax-only).

MSVC issues:

1. clang must parse this:
   extern "C++" template<class T> f(T a);

2. clang must parse this:
     0x6cb9a43e+(1), <== hexadecimal literal lexing issue

3. type_info predeclared, IUnknown predeclared
  add "class type_info;struct IUnknown;" in MacroBuilder?

4. __pragma support
5. __uuidof operator

6. flexible array member in union

typedef struct _PROPERTYINSTEX
{
    WORD Length;
    union
    {
        BYTE Byte[];
        WORD Word[];
        DWORD Dword[];
        LARGE_INTEGER LargeInt[];
        SYSTEMTIME SysTime[];
        TYPED_STRING TypedString;
    };
} PROPERTYINSTEX;

7. clang must parse this: (superfluous A:: qualifier)
     class A {
         int A::f() { return 0; }
     };

8. clang must parse this:
     enum TATA {
          a = (TATA) 3
     };

9. MSVC Compiler Intrinsics
    ex: __noop
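
To illustrate item 2: I assume the lexing issue is the standard "pp-number" rule, under which an 'e' or 'E' followed by a sign is swallowed into the number token even inside a hexadecimal literal. Here is a minimal sketch of that rule following the C++03 grammar. scanPPNumber is a hypothetical helper written for this mail, not a clang function.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of clang): scan one pp-number starting at
// `pos`, following the C++03 grammar:
//   pp-number: digit | . digit | pp-number digit | pp-number nondigit
//            | pp-number e sign | pp-number E sign | pp-number .
// Returns the token text, or an empty string if no pp-number starts at `pos`.
std::string scanPPNumber(const std::string &src, std::size_t pos = 0) {
  auto isDigit = [](char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
  };
  auto isNonDigit = [](char c) {
    return std::isalpha(static_cast<unsigned char>(c)) != 0 || c == '_';
  };

  std::size_t i = pos;
  // A pp-number starts with a digit, or with '.' followed by a digit.
  if (i < src.size() && isDigit(src[i])) {
    ++i;
  } else if (i + 1 < src.size() && src[i] == '.' && isDigit(src[i + 1])) {
    i += 2;
  } else {
    return std::string();
  }

  while (i < src.size()) {
    char c = src[i];
    if ((c == 'e' || c == 'E') && i + 1 < src.size() &&
        (src[i + 1] == '+' || src[i + 1] == '-')) {
      i += 2;  // "e sign": the sign is consumed, even in a hex literal
    } else if (isDigit(c) || isNonDigit(c) || c == '.') {
      ++i;
    } else {
      break;  // anything else (such as '(') ends the token
    }
  }
  return src.substr(pos, i - pos);
}
```

With this rule, scanPPNumber("0x6cb9a43e+(1)") yields the single token "0x6cb9a43e+", which is not a valid integer literal, whereas "0x6cb9a43f+(1)" yields "0x6cb9a43f" as one would expect. MSVC accepts the first form, so in -fms-extensions mode the lexer would presumably need to stop before the sign when the token starts with 0x.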


I found a bug in the types produced by the clang front end when different classes produce the same LLVM type (e.g. pure virtual classes):
%class.vBase1 = type { i32 (...)** }
%class.vBase2 = type { i32 (...)** }

If an object of type vBase1 is instantiated before vBase2 is used, clang produces a bitcast to vBase1 even where vBase2 is required (line 57 of type_error.s):
%2 = bitcast %class.UES* %this1 to %class.vBase1*, !dbg !47 ; <%class.vBase1*> [#uses=1]

I've attached a small example that reproduces the error using the current svn rev 11805.

I tried to figure out why it goes wrong as well, but quite frankly it's pretty opaque where and how the type for the incorrect cast is stored. How can you retrieve the type name (like class.vBase1) from an llvm::Type instance? The only thing I can get is "{ i32 (...)** }", which is not very helpful at all.

As a side note: llvm is a pain in the a** to debug because it uses all these "super efficient" tricks to store things (small vectors, the low bits of pointers, etc). It's pretty much impossible to inspect a data structure and actually get something useful out of it (yes, you can call Obj->dump() in gdb, but that's not nearly as useful!). Since I don't care about speed in a debug build at all, it would be neat if these tricks were used only for release builds, and debug builds used slow but plain data structures.

It'd be great if someone who knows the type system code better than I do could help me figure this out and maybe explain how the type name is actually recovered from an llvm::Type instance.

Should I open a bug tracker entry somewhere?


type_error.cpp (339 Bytes)

type_error.s (15.5 KB)

No, you didn't; you found the expected behaviour. LLVM types are structural types; they are not intended to encapsulate all of the source language type information (this can be attached as metadata, if required, and is in the debug info). If your source language types have the same layout, they MUST have the same LLVM type, or a number of optimisations would fail.

Remember what the LL stands for in LLVM.
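
To make "same layout" concrete, here is a small sketch. The class bodies are my own guess at what the report's vBase1 and vBase2 look like, and sameVptrOnlyLayout is a hypothetical helper. It assumes a mainstream ABI (Itanium or MSVC), where a class with virtual functions and no data members consists of just a vtable pointer.

```cpp
#include <cstddef>

// Two unrelated abstract classes; the names mirror the report, the
// members are assumptions made for this example.
class vBase1 {
public:
  virtual int f() = 0;
  virtual ~vBase1() {}
};

class vBase2 {
public:
  virtual int g() = 0;
  virtual ~vBase2() {}
};

// Hypothetical helper: true when both classes occupy exactly one pointer,
// i.e. each object is nothing but a vtable pointer. This is an ABI
// property, not something the C++ standard guarantees.
bool sameVptrOnlyLayout() {
  return sizeof(vBase1) == sizeof(void *) &&
         sizeof(vBase2) == sizeof(void *) &&
         sizeof(vBase1) == sizeof(vBase2);
}
```

Both classes are unrelated in C++, yet each one lowers to a struct holding a single vtable pointer, which is exactly the { i32 (...)** } you see for both.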


-- Sent from my Apple II

Thanks for making this list. I've put it up on the wiki (along with
other things):

As the page says, feel free to add, remove, modify, make better
(including the name of the article, but that needs to be done sooner
rather than later if it is to be done).

- Michael Spencer

Hm, looking into the assembly file I attached, there are annotations for the type names (e.g. class.vBase1) which are incorrect. Does that not qualify as a bug/problem? I'm aware of the debug info. Still, the (bit)cast generated by clang casts unrelated pointers.


The type names don't matter; only the structure matters. That's how LLVM IR is defined, and you're trying to read too much into the type names that are there in the textual form of the IR.
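
As a rough illustration, here is a toy model of structurally uniqued types with names kept in a side table. ToyTypeContext and its members are invented for this example and are not LLVM APIs; the real implementation differs in the details.

```cpp
#include <map>
#include <string>

// Toy model only: nothing here is an LLVM API.
class ToyTypeContext {
public:
  // Uniquing: the same structural description always yields the same id.
  int getType(const std::string &structure) {
    std::map<std::string, int>::iterator it = ids_.find(structure);
    if (it != ids_.end())
      return it->second;
    int id = static_cast<int>(ids_.size());
    ids_[structure] = id;
    return id;
  }

  // Names live in a side table that points at types; several names may
  // refer to the same type.
  void addName(const std::string &name, int typeId) { names_[name] = typeId; }

  // What a printer could do: pick the first name that maps to the type.
  std::string printName(int typeId) const {
    for (std::map<std::string, int>::const_iterator it = names_.begin();
         it != names_.end(); ++it) {
      if (it->second == typeId)
        return it->first;
    }
    return "<unnamed>";
  }

private:
  std::map<std::string, int> ids_;    // structure -> unique type id
  std::map<std::string, int> names_;  // name -> type id
};
```

In this model, asking for "{ i32 (...)** }" twice returns the same id, so registering both class.vBase1 and class.vBase2 attaches two names to one type. Printing a value of that type then shows class.vBase1 no matter which class the front end had in mind, which matches the output in the report.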

  - Doug