Was this an unintended consequence of the Args switch to StringRef's?

memory write's argument ingestion was changed as part of the StringRefifying of Args so that we get:

(lldb) memory write &buffer 0x62
error: '0x62' is not a valid hex string value.

That seems unexpected and not desirable. What's going on is that the default format is hex, and if the format is hex, the command also supports:

(lldb) memory write -f x &buffer 62
(lldb) fr v/x buffer[0]
(char) buffer[0] = 0x62

The StringRef version of the args parsing is:

      case eFormatDefault:
      case eFormatBytes:
      case eFormatHex:
      case eFormatHexUppercase:
      case eFormatPointer:
        // Decode hex bytes
        if (entry.ref.getAsInteger(16, uval64)) {

The problem is that passing "0x62" to getAsInteger with a radix of 16 rejects "0x62".

We do want to hint the radix. But it seems weird to reject an explicit indicator. Is there some clever way to use the StringRef functions to get the desired effect, or do I have to hack around this by manually stripping the 0x if I see it?


That does seem unintentional. Is there any particular reason we want to hint the radix? If you’re going to specify 0x anyway, then what value does adding the 16 to the getAsInteger add?

I think somebody was being too clever. The real purpose for specifying the format is to set the byte size of the item written down:

    OptionValueUInt64 &byte_size_value = m_format_options.GetByteSizeValue();
    size_t item_byte_size = byte_size_value.GetCurrentValue();

and to do some special magic for c-strings and the like. Somebody (not me) must have thought overloading "hex" format to mean "you can drop the 0x" was a tasty little side benefit.

But it's been that way forever, and I don't know who's relying on that behavior, so I'd rather not change it (intentionally) if we don't have to.


What if you write this:

// Explicitly try hex. This will catch the case where there is no 0x prefix.
if (entry.ref.getAsInteger(16, uval64)) {
// If that fails, let the algorithm auto-detect the radix. This will catch the case where there is a 0x prefix.
if (entry.ref.getAsInteger(0, uval64)) {

Would that work? Alternatively, we could probably update the implementation of getAsInteger in LLVM to you to manually hint the radix AND prefix the number, as long as the prefix matches the hint.

It does seem unfortunate that getAsInteger works differently from strtol's documented behavior in this little way... Makes conversions like this one trickier than necessary. But presumably somebody had a reason for it to be different?

Anyway, I have to go see if this feature of strtol & Co. is used in more places than CommandObjectMemoryWrite. If it was a common idiom then it might be nice to have direct support for it. But if it's just in one place, then hacking around it like this seems fine.


base of 16 was only used in a few places. It's used frequently in the gdb remote protocol code, but I don't think the gdb remote protocol ever sends hex numbers with the 0x prefix, it assumes you know the radix, so that should be okay. Other than that this was the only consequential usage. So I went with detecting the 0x by hand and treating that separately (r299609).

Why does getAsInteger return false if it succeeds in getting an integer? That just seems weird to me.


It is weird, I think everyone hates it. One day I’ll try to change it, but since it’s used so widely, it’s a risky change.

If it helps, you can think of the return value as a std::error_code, where the operator bool returns true if there’s an error. Like

if (std::error_code EC = foo())

Is pretty intuitive, it’s just weird with bool.

BTW, Jason says that sometimes the gdb remote protocol does send 0x prefaced numbers. Apparently not in the cases which have been changed to use getAsInteger with a radix of 16, but that is something we'll have to keep in mind.


Honestly i think it seems reasonable for getAsInteger to just work in this case, so I’ll try to add it to llvm

If you can get that in, that would be great!