Question about an error we're now starting to get on LLVM 3.8.0rc2since

+llvm-dev

Duncan,
        Kevin Harris here, from Unisys. In our application, generating LLVM IR, we have several instances of code that looks like this:

        . . .
        Value* pDS;
        . . .
            auto argIt = pFunc->arg_begin();
            pDS = argIt;
        . . .

        This construct works fine in 3.7, but in LLVM 3.8.0rc2, I’m now getting an error:

g++-5.2.0 `/home/kharris/dyntrans/llvm-install/bin/llvm-config --cxxflags` -D JITREV="\"4793\"" -D GPPVERSION="\"5.2.0"\" -D LLVMVERSION=38 -D LLVMBUILDVERSION="\"3.8.0"\" -D LLVMBUILDTYPE="\"Debug+Asserts"\" -fno-rtti -Wall -Wextra -Wshadow -c -fmessage-length=0 -fPIC -std=c++11 -o "Debug/dt_llvm.o" "Debug/../dt_llvm.cpp" -I ../MarinerIP -O3 -g3 -m64 -march=core2 -MMD -MP -MF "Debug/dt_llvm.d" -MT "Debug/dt_llvm.d"
Invoking: g++-5.2.0 Compiler

Debug/../dt_llvm.cpp: In static member function âstatic llvm::Function* BREGS::selectBasicBreg(Access_t)â:
Debug/../dt_llvm.cpp:1772:17: error: cannot convert âllvm::ilist_iterator<llvm::Argument>â to âllvm::Value*â in assignment
             pDS = argIt;
                 ^

You probably want `pDS = &*argIt`, which makes the conversion from a
possibly-not-dereferenceable-iterator to a pointer explicit.

        Evidently, the result of the call to pFunc->arg_begin(); is no longer convertible to a “Value*” type in 3.8. I checked on the type of an Argument, it is still derived from a Value type, so that isn’t the problem. But I discovered that an “ArgumentListType” has indeed changed. In 3.7, in /include/IR/Function.h, this was declared as:

  typedef iplist<Argument> ArgumentListType;

whereas in LLVM 3.8.0rc2, this is changed to:

  typedef SymbolTableList<Argument> ArgumentListType;

        Rummaging around in the svn log for Function.h, I discovered that this change occurred at svn rev 249602, which you annotated as:

IR: Create SymbolTableList wrapper around iplist, NFC

Create `SymbolTableList`, a wrapper around `iplist` for lists that
automatically manage a symbol table. This commit reduces a ton of code
duplication between the six traits classes that were used previously.

As a drive by, reduce the number of template parameters from 2 to 1 by
using a SymbolTableListParentType metafunction (I originally had this as
a separate commit, but it touched most of the same lines so I squashed
them).

I'm in the process of trying to remove the UB in `createSentinel()` (see
the FIXMEs I added for `ilist_embedded_sentinel_traits` and
`ilist_half_embedded_sentinel_traits`). My eventual goal is to separate
the list logic into a base class layer that knows nothing about (and
isn't templated on) the downcasted nodes -- removing the need to invoke
UB -- but for now I'm just trying to get a handle on all the current use
cases (and cleaning things up as I see them).

Besides these six SymbolTable lists, there are two others that use the
addNode/removeNode/transferNodes() hooks: the `MachineInstruction` and
`MachineBasicBlock` lists. Ideally there'll be a way to factor these
hooks out of the low-level API entirely, but I'm not quite there yet.

        Since this change is annotated as NFC, I’m thinking that the actual cause of my error is some other (earlier) change to the way ArgumentListType is derived, but I haven’t found it yet, so I thought I’d bug you first.

You're looking for r252380. Relevant excerpt:

    Note: if you have out-of-tree code, it should be fairly easy to revert
    this patch downstream while you update your out-of-tree call sites.
    Note that these conversions are occasionally latent bugs (that may
    happen to "work" now, but only because of getting lucky with UB;
    follow-ups will change your luck). When they are valid, I suggest using
    `->getIterator()` to go from pointer to iterator, and `&*` to go from
    iterator to pointer.

        Is this change indeed intended as a visible API change to source code generating references to argument list values? If so, can you point me to a description of how I should change our code? Should I bug someone else about this problem? Should this API change be documented in the 3.8.0 release notes?

Release notes is a good idea; somehow I never think of that. Hans, is it
too late?

From: dexonsmith@apple.com [mailto:dexonsmith@apple.com]
Subject: Re: Question about an error we're now starting to get on LLVM 3.8.0rc2since

> Debug/../dt_llvm.cpp:1772:17: error: cannot convert âllvm::ilist_iterator<llvm::Argument>â
> to âllvm::Value*â in assignment
> pDS = argIt;
> ^

You probably want `pDS = &*argIt`, which makes the conversion from a
possibly-not-dereferenceable-iterator to a pointer explicit.

Now that you point out the proper construct, it becomes obvious. Thanks very much.

- Chuck

Not at all. I haven't even started nagging people yet :slight_smile:

Please either commit directly to the release notes on the branch, or
send me an email.

Thanks,
Hans

I wonder if it isn't it a bit to much detailed to put this in the release notes? I mean we change the C++ API all the time...

Here is the level of details we went with in 3.7: http://llvm.org/releases/3.7.0/docs/ReleaseNotes.html#non-comprehensive-list-of-changes-in-this-release

Sure, but I think anything that helps users move to the next version
is valuable. I realize we'll probably never have exhaustive lists of
API changes in the release notes, but every little helps, and if
Duncan wants to write a note for this one, I'll certainly take it.

Thanks,
Hans

If we’re considering adding items to the release notes:

A couple of pain points I encountered when migrating my code from 3.5.1 to 3.7.1. I don’t keep ToT, and thereby avoid the constant API churn, but then I have a bit of work to do when upgrading from one release to another.

Bitcode reader/writer: I have a class that creates a Module*, and optionally preloads it with bitcode read from a file.

In 3.5.1 I used:

ErrorOr<Module *> parseBitcodeFile(MemoryBuffer *Buffer,
LLVMContext &Context);

In 3.7.1 this became:

ErrorOr<std::unique_ptr>
parseBitcodeFile(MemoryBufferRef Buffer, LLVMContext &Context,
DiagnosticHandlerFunction DiagnosticHandler = nullptr);

I had a bit of difficulty figuring out why the unique_ptr<> was used. I thought the purpose of a unique_ptr<> was to ensure that some (temporary) object could never be leaked or dangled, by making it very hard to get at the underlying raw pointer. Yet, everywhere else I needed to pass a Module*, and there’s nothing to stop me from dangling it.

I understand we’re not out to create a C++11 tutorial, but major changes in the memory ownership semantics ought to be documented.

Debug metadata generation: LOTS of changes, but the worst:

In 3.5.1 we have:

DICompositeType createReplaceableForwardDecl(
unsigned Tag, StringRef Name, DIDescriptor Scope, DIFile F,
unsigned Line, unsigned RuntimeLang = 0, uint64_t SizeInBits = 0,
uint64_t AlignInBits = 0, StringRef UniqueIdentifier = StringRef());

In 3.7.1 this was changed to:

DICompositeType *createReplaceableCompositeType(
unsigned Tag, StringRef Name, DIScope *Scope, DIFile *F, unsigned Line,
unsigned RuntimeLang = 0, uint64_t SizeInBits = 0,
uint64_t AlignInBits = 0, unsigned Flags = DINode::FlagFwdDecl,
StringRef UniqueIdentifier = “”);

However, simply making the code change and adjusting the arguments leads to an assert. I also had to call replaceTemporary() at the point where the replacement was available. This was not required in 3.5.1. Prior to this point I was unaware of the distinction between uniqued and non-uniqued metadata, nor did I have any reason to care.

r260415

Hi David,

Have you started looking at 3.8 yet? Your email mentions issues with
3.7, but the ship for updated those release notes sailed when it was
released.

Cheers,
Hans

From: mehdi.amini@apple.com [mailto:mehdi.amini@apple.com]
Sent: Wednesday, February 10, 2016 1:51 PM
To: Duncan P. N. Exon Smith
Subject: Re: [llvm-dev] Question about an error we're now starting to get on LLVM 3.8.0rc2since

You're looking for r252380. Relevant excerpt:

   Note: if you have out-of-tree code, it should be fairly easy to revert
   this patch downstream while you update your out-of-tree call sites.
   Note that these conversions are occasionally latent bugs (that may
   happen to "work" now, but only because of getting lucky with UB;
   follow-ups will change your luck). When they are valid, I suggest using
   `->getIterator()` to go from pointer to iterator, and `&*` to go from
   iterator to pointer.

       Is this change indeed intended as a visible API change to source code generating references to argument list values? If so, can you point me to a description of how I should >>> change our code? Should I bug someone else about this problem? Should this API change be documented in the 3.8.0 release notes?

Release notes is a good idea; somehow I never think of that. Hans, is it
too late?

I wonder if it isn't it a bit to much detailed to put this in the release notes? I mean we change the C++ API all the time...

Here is the level of details we went with in 3.7: LLVM 3.7 Release Notes — LLVM 3.7 documentation

--
Mehdi

I strongly recommend that C++ API changes, from one release to the next, be documented in the release notes. I venture to say that a large fraction of out-of-tree LLVM users (like us) are exposed to API changes only at release times. I know of no other central place where they are documented. Not all API changes are discussed in llvm-dev. Even when they are, the API changes are often buried in a lot of additional design detail not relevant to the source code of the callers.

To maximize the release candidate testing time for us out-of-tree users, I also recommend introducing any API change documentation as early as possible - in RC1 if possible.

For the 3.7.0 release, it took me the better part of a month to track down and remedy the changes to our sources that were necessary, and by then, it was too late to reflect any of our findings in the final release. We even had to go back and re-implement some fixes that we initially just worked around. We just finished with those when 3.8.0rc2 came out. The initial elapsed time to get 3.7 up and running in our environment would have been only a couple of days, if the API changes had been highlighted early.

In contrast, this change to formal Argument referencing appears to be the only API change that affected us for the 3.8 release. So I doubt that the net burden on the committers is actually significant.

The text of the change documentation need not be large: Just an example of the caller change necessary is usually sufficient, perhaps with a pointer to any prior discussions that motivated the change, and reference to what headers are affected. As Duncan did here.

Thanks for the consideration!
  -Kevin

I think such a shift in the policy would require at least an dedicated RFC on llvm-dev and a change in the developer guide.

This is a great reason to start tracking trunk and maybe even move
to in-tree development ;). Realistically, C++ API changes do
happen all the time, and I think documenting them has to be on a
"best effort" basis. We could certainly be better at this, but I
wouldn't want to move to a model where it was required.

Right now it is not even officially "encouraged". I don't know of any guideline on this (it there any?), and I'm not fond of the "stuff gets randomly in the release notes while other stuff does not without any good reason".
On the topic, I'm not convinced adding minor changes to the C++ API to release notes is a good direction all. I'd rather keep the surface of the release notes limited to high-level changes: how to initialize the compiler, add passes, etc.
Till recently, only the C API changes were documented "best effort" in the release notes (they are now required to be documented there).

From: dexonsmith@apple.com [mailto:dexonsmith@apple.com]
Sent: Wednesday, February 10, 2016 2:38 PM
To: Mehdi Amini
Cc: Harris, Kevin; LLVM Dev
Subject: Re: [llvm-dev] Question about an error we're now starting to get on LLVM 3.8.0rc2since

Release notes is a good idea; somehow I never think of that. Hans, is it
too late?

I wonder if it isn't it a bit to much detailed to put this in the release notes? I mean we change the C++ API all the time...

Here is the level of details we went with in 3.7: LLVM 3.7 Release Notes — LLVM 3.7 documentation

--
Mehdi

I strongly recommend that C++ API changes, from one release to the next, be documented in the release notes. I venture to say that a large fraction of out-of-tree LLVM users (like us) are exposed to API changes only at release times. I know of no other central place where they are documented. Not all API changes are discussed in llvm-dev. Even when they are, the API changes are often buried in a lot of additional design detail not relevant to the source code of the callers.

To maximize the release candidate testing time for us out-of-tree users, I also recommend introducing any API change documentation as early as possible - in RC1 if possible.

For the 3.7.0 release, it took me the better part of a month to track down and remedy the changes to our sources that were necessary, and by then, it was too late to reflect any of our findings in the final release. We even had to go back and re-implement some fixes that we initially just worked around. We just finished with those when 3.8.0rc2 came out. The initial elapsed time to get 3.7 up and running in our environment would have been only a couple of days, if the API changes had been highlighted early.

In contrast, this change to formal Argument referencing appears to be the only API change that affected us for the 3.8 release. So I doubt that the net burden on the committers is actually significant.

The text of the change documentation need not be large: Just an example of the caller change necessary is usually sufficient, perhaps with a pointer to any prior discussions that motivated the change, and reference to what headers are affected. As Duncan did here.

I think such a shift in the policy would require at least an dedicated RFC on llvm-dev and a change in the developer guide.

That sounds reasonable. I'm unfamiliar with the formalities, but I wouldn't mind leading the charge.

This is a great reason to start tracking trunk and maybe even move
to in-tree development ;). Realistically, C++ API changes do
happen all the time, and I think documenting them has to be on a
"best effort" basis. We could certainly be better at this, but I
wouldn't want to move to a model where it was required.

Tracking the trunk is a reasonable suggestion, and I've considered it. But that doesn't obviate the need for API changes to be highlighted and described, in order to understand how to modify the API callers. Sometimes, esp. when a change strongly increases the level of abstraction, it takes a long time to determine how to modify the caller, even when you've already pinpointed the change.

Moving to in-tree development sounds desirable, but would unlikely be of interest to anyone else, and would require the participation of lawyers . . . :slight_smile:

I'm mostly happy with a "best-effort" basis, as long as it doesn't degenerate into a "no-effort" basis, which is what happened with 3.7.
  -Kevin

The "formalities" is pretty limited (I hope...): an email to llvm-dev (starting with something "[RFC]" is preferable) to allow everyone to chime in and reach a consensus.
It is best if can try to explain in the RFC why such a change is a good thing, what should and should not be added, one (or multiple) examples might be welcome, and optionally a patch that introduce the guideline (even on a "best effort") in docs/DeveloperPolicy.rst
(requirement for C API to be documented in the release notes are already in this file)

Recent examples:

http://lists.llvm.org/pipermail/llvm-dev/2016-February/094804.html
http://lists.llvm.org/pipermail/llvm-dev/2016-February/094851.html
http://lists.llvm.org/pipermail/llvm-dev/2016-February/094869.html
http://lists.llvm.org/pipermail/llvm-dev/2016-February/095275.html

Hope this helps,