Kaleidoscope tutorial: comments, corrections and Windows support

Hi,

I'm currently working my way through the tutorial with LLVM 3.9.1 on Windows (finished chapter 4) and stumbled over a few things which could be improved:

  - "LLVMContext" does not exist as a variable -> "TheContext"
    - Chapter 3: 5 times
    - Chapter 4: 1 time
    - Chapter 5: 4 times
    - Chapter 6: 2 times
    - Chapter 7: 2 times

3.4. Function Code Generation

  - Wouldn't it make sense to have a
    "Type *DoubleTy = Type::getDoubleTy(TheContext);"?
  - "Function *TheFunction = TheModule->getFunction(Proto->getName());"
    The new getName method is unexpected, the tutorial user has to lookup
    and copy the definition from the full listing.

3.5. Driver Changes and Closing Thoughts

  - "When you quit the current demo"
    -> "When you quit the current demo (by sending an EOF
    via CTRL+D on Linux or CTRL+Z and ENTER on Windows)"
    (Windows users normally don't send EOF characters so they
    usually don't know this shortcut)

4.3. LLVM Optimization Passes

  - Code for InitializeModuleAndPassManager:
    - Remove "Context LLVMContext;"
    - "TheJIT" comes out of nowhere and thus is not defined in the code
      of the tutorial user. Setting the data layout could be added in
      4.4, as it is at least not required for showing the improved body
      of "test"
    - "createBasicAliasAnalysisPass" does not exist anymore (llvm-3.9.1).
      Also requires adaption of the text

4.4. Adding a JIT Compiler

  - Code for main:
    - "TheJIT = llvm::make_unique<KaleidoscopeJIT>();" will cause a crash
      due to an invalid read in "Datalayout::operator=" as
      "EngineBuilder().selectTarget()" returns a nullptr when
      "InitializeNativeTarget" has not been called yet.
      Other problems occur if the other InitializeNative* functions are
      not called. So you should add them here.
    - The (silently added) "TheModule->print" call from chapter 3
      is missing

  - "The KaleidoscopeJIT class is a simple JIT built specifically for
     these tutorials.":
    It would be nice to mention, where to find the include file:
    llvm-src/examples/Kaleidoscope/include/KaleidoscopeJIT.h

  - "auto ExprSymbol = TheJIT->findSymbol("__anon_expr");":
    On Windows, this will always fail as this function will search for
    "ExportedSymbolsOnly" and the exported flag seems to be implemented
    incorrectly for COFF in LLVM (see bug report from Russell Wallace:
    http://lists.llvm.org/pipermail/llvm-dev/2016-April/097964.html ).
    Applied to the current version of Kaleidoscope, he suggested to use
    "CompileLayer.findSymbolIn(H, Name, false)" in
    "KaleidoscopeJIT::findMangledSymbol" to also allow non-exported
    symbols.
    I fixed it by changing "COFFObjectFile::getSymbolFlags" in
    llvm-src/lib/Object/COFFObjectFile.cpp
    to additionally OR SymbolRef::SF_Exported to Result:

uint32_t COFFObjectFile::getSymbolFlags(DataRefImpl Ref) const {
  COFFSymbolRef Symb = getCOFFSymbol(Ref);
  uint32_t Result = SymbolRef::SF_None;

  if (Symb.isExternal() || Symb.isWeakExternal())
    Result |= SymbolRef::SF_Global | SymbolRef::SF_Exported;

    Works and sounds reasonable to me, but maybe you can prove me wrong.

  - "ready> sin(1.0);": On Windows, this fails with "LLVM ERROR: Program
    used external function '_sin' which could not be resolved!" because
    "DynamicLibrary::SearchForAddressOfSymbol" iterates over all loaded
    libraries and uses GetProcAddress to try getting the address.
    But for the old "msvcrt.dll" as well as the newer "ucrtbased.dll",
    the math functions are exported without a leading underscore.
    Additionally, LLVM provides some "explicit" symbols which would
    otherwise only exist in form of a macro (see comment in
    llvm-src/lib/Support/Windows/DynamicLibrary.inc). These are also only
    available in unmangled form, for example "sinf".
    Thus I propose to add this code at the end of
    "KaleidoscopeJIT::findMangledSymbol":

#ifdef LLVM_ON_WIN32
    // For Windows retry without "_" at begining, as RTDyldMemoryManager uses
    // GetProcAddress and standard libraries like msvcrt.dll use names
    // with and without "_" (for example "_itoa" but "sin").
    if (Name.length() > 2 && Name[0] == '_')
      if (auto SymAddr = RTDyldMemoryManager::getSymbolAddressInProcess(
            Name.substr(1)))
        return JITSymbol(SymAddr, JITSymbolFlags::Exported);
#endif

    return nullptr;
  }

  - "ready> sin(1.0);
     Read top-level expression:
     define double @2() {
     entry:
       ret double 0x3FEAED548F090CEE
     }

     Evaluated to 0.841471":
    Actually "sin" is not called at all, LLVM knew "sin" and just returns
    the already const-folded value.
    In my case, I do see a
    "%calltmp = call double @sin(double 1.000000e+00)",
    though, inspite of the same optimizations as in your listing...

  - "ready> def foo(x) sin(x)*sin(x) + cos(x)*cos(x);
     Read function definition:
     define double @foo(double %x) {
     entry:
       %calltmp = call double @sin(double %x)
       %multmp = fmul double %calltmp, %calltmp
       %calltmp2 = call double @cos(double %x)
       %multmp4 = fmul double %calltmp2, %calltmp2
       %addtmp = fadd double %multmp, %multmp4
       ret double %addtmp
     }":
    In my case, the CSE does not work, although it does work for the
    example in 4.3:
     "ready> def foo(x) sin(x)*sin(x) + cos(x)*cos(x);
     > Read function definition:
     define double @foo(double %x) {
     entry:
       %calltmp = call double @sin(double %x)
       %calltmp1 = call double @sin(double %x)
       %multmp = fmul double %calltmp, %calltmp1
       %calltmp2 = call double @cos(double %x)
       %calltmp3 = call double @cos(double %x)
       %multmp4 = fmul double %calltmp2, %calltmp3
       %addtmp = fadd double %multmp, %multmp4
       ret double %addtmp
     }"

  - "extern putchard(x); putchard(120);":
    Again, on Windows, this doesn't work. To make it work, you have to
    use "extern "C" __declspec(dllexport) double putchard(double X)"
    so the function will be found via GetProcAddress for the main module.
    This wouldn't hurt for Linux / Mac, would it?

Best regards
Moritz Kroll

Any chance you would submit a patch to fix all this?

http://llvm.org/docs/DeveloperPolicy.html
http://llvm.org/docs/Phabricator.html

Thanks,

Hi Mehdi,

well, some comments are actually questions, but I'll see what I can do.
For Chapter 5, I'm already changing the tutorial directly based on svn trunk.

Best regards
Moritz

Hi Mehdi,

I submitted a patch:

[PATCH] D29864: Update Kaleidoscope tutorial and improve Windows support

https://reviews.llvm.org/D29864

An open question: Why doesn't CSE work for sin and cos as shown at the end of section 4.4?
It looks like, they are not recognized as intrinsics and don't get the according attributes in Function::Function.
Function::recalculateIntrinsicID requires the function name to start with "llvm.".
Maybe this worked in a previous version of LLVM?

Optimizing sin and cos is specifically mentioned in AliasAnalysis.h:

llvm-3.9.1.src\include\llvm\Analysis\AliasAnalysis.h:

   /// Checks if the specified call is known to never read or write memory.
   ///
   /// Note that if the call only reads from known-constant memory, it is also
   /// legal to return true. Also, calls that unwind the stack are legal for
   /// this predicate.
   ///
   /// Many optimizations (such as CSE and LICM) can be performed on such calls
   /// without worrying about aliasing properties, and many calls have this
   /// property (e.g. calls to 'sin' and 'cos').
   ///
   /// This property corresponds to the GCC 'const' attribute.
   bool doesNotAccessMemory(ImmutableCallSite CS) {
     return getModRefBehavior(CS) == FMRB_DoesNotAccessMemory;
   }

Best regards
Moritz

Hi Mehdi,

I submitted a patch:

[PATCH] D29864: Update Kaleidoscope tutorial and improve Windows support

https://reviews.llvm.org/D29864

Thanks!

An open question: Why doesn't CSE work for sin and cos as shown at the end of section 4.4?

CSE works on functions marked as “readnone” , the JIT does not does add the attributes automatically.

My guess is that it is the responsibility of the frontend to either emit a call to the intrinsic or call a function sin() with the proper set of attributes.