RFC: AArch64 Linux Memory Tagging Support for LLDB

Hi all,

What follows is my proposal for supporting AArch64's memory tagging
extension in LLDB. I think the link in the first paragraph is a good
introduction if you haven't come across memory tagging before.

I've also put the document in a Google Doc if that's easier for you to
read: https://docs.google.com/document/d/13oRtTujCrWOS_2RSciYoaBPNPgxIvTF2qyOfhhUTj1U/edit?usp=sharing
(please keep comments to this list though)

Any and all comments welcome. Particularly I would like opinions on
the naming of the commands, as this extension is AArch64 specific but
the concept of memory tagging itself is not.
(I've added some people on Cc who might have particular interest)

Thanks,
David Spickett.

<begin doc>

# RFC: AArch64 Linux Memory Tagging Support for LLDB

## What is memory tagging?

Memory tagging is an extension added in the Armv8.5-a architecture for AArch64.
It allows tagging pointers and storing those tags so that hardware can validate
that a pointer matches the memory address it is trying to access. These paired
tags are stored in the upper bits of the pointer (the “logical” tag) and in
special memory in hardware (the “allocation” tag). Each tag is 4 bits in size.

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/enhancing-memory-safety

## Definitions

* memtag - This is the clang name for the extension as in
“-march=armv8.5-a+memtag”
* mte - An alternative name for memtag, also the llvm backend name for
the extension.
  This document may use memtag/memory tagging/MTE at times, they mean
the same thing.
* logical tag - The tag stored inside a pointer variable (accessible
via normal shift and mask)
* allocation tag - The tag stored in tag memory (which the hardware provides)
  for a particular tag granule
* tag granule - The amount of memory that a single tag applies to,
which is 16 bytes.
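
To make the "normal shift and mask" wording concrete, here is a minimal sketch of reading and replacing a logical tag. It assumes the tag sits in bits 56 to 59 of a 64 bit pointer value; the function names are invented for this example and are not part of lldb.

```python
# Assumption: the MTE logical tag occupies bits 56-59 of a 64 bit pointer.
TAG_SHIFT = 56
TAG_MASK = 0xF


def get_logical_tag(ptr: int) -> int:
    """Return the 4 bit logical tag stored in the pointer value."""
    return (ptr >> TAG_SHIFT) & TAG_MASK


def set_logical_tag(ptr: int, tag: int) -> int:
    """Return ptr with its logical tag replaced by tag."""
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag must be in the range 0x0 to 0xF")
    # Clear the old tag bits, then insert the new tag.
    return (ptr & ~(TAG_MASK << TAG_SHIFT)) | (tag << TAG_SHIFT)
```

The rest of the pointer is left untouched, so the result still refers to the same memory.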

## Existing Tool Support

* GCC/Clang can generate MTE instructions
* Clang has an option to memory tag the stack (discussed later)
* QEMU support has been merged
* Linux Kernel patches are in progress
  (git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
“devel/mte-v5” branch)
* GDB support is in review and this design takes a lot of direction from that
  (https://sourceware.org/git/?p=binutils-gdb.git;a=shortlog;h=refs/heads/users/luisgpm/aarch64-mte-v2)
  (originally proposed
https://sourceware.org/pipermail/gdb-patches/2019-August/159881.html)

## New lldb features

Assuming your software is acting correctly, memory tagging can “just work”
without debugger support. This assumes the compiler/toolchain/user are
always correct.

For when that isn’t the case we want to be able to:
* Read/write the logical tags in a pointer
* Read/write the allocation tags assigned to a given area of memory
* Test whether the logical tag in a pointer matches the allocation tag of the
  memory it refers to
* Read/write memory even when tags are mismatched

The most obvious use case for this is working through issues where bugs in the
toolchain don’t generate correct code. On the other hand there’s a good case for
deliberately messing with pointers in your code to prove that such protection
actually works.

Note: potential extensions to scripting, such as exposing tags as attributes of
values, are not being proposed here. The new commands will be added in the
standard ways, so you can still use them from scripts.

## New Commands

### Command Availability

Note: commands will be listed in tab completion and help regardless of
these checks

* The remote server must support memory tagging packets. lldb will send/check
  for the “memory-tagging” feature in the qSupported packet. (this
name aligns with gdb)
* The process must have MTE available. We check HWCAP2_MTE for this.
* The process must have enabled tagged addressing using prctl
  (see “New Registers” for details)
* The address given must be in a range that has MTE enabled, since you can mmap
  with or without MTE. (this information is in /proc/.../smaps)

#### Interaction With Clang’s Stack Tagging

We’re relying on the kernel to tell us if MTE is enabled, so stack tagging will
not be visible to the debugger this way.
(https://github.com/google/sanitizers/wiki/Stack-instrumentation-with-ARM-Memory-Tagging-Extension-(MTE))

E.g. {int x; use(&x); } where use is void use(int* ptr);
“ptr” will have a memory tag but the kernel won’t know this.

To work around this a setting will be added to tell lldb to assume that MTE is
enabled, so that you can at least see the logical tags of a pointer.
(see “New Settings”)

### General Properties/Errors

* <address expression> must resolve to some value that can be handled as an
  address by lldb. (though it need not be a pointer specifically)
* Tags will be printed in hexadecimal to reflect the fact that they are a 4 bit
  field. (and since tags are randomly generated, ordering is unlikely
to be a concern)
* Packed tags will be 1 tag per byte (matches what ptrace expects)
* Addresses will be rounded down to the nearest granule (not always by lldb
  itself but what the user sees will look like this)
* Ranges are rounded up to a whole number of granules
* It is an error to use a command on an address that does not have MTE enabled.
  (with the exception of “mtag check”)
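
The rounding rules in this list could be sketched roughly as below. This treats the two roundings independently, exactly as the bullets describe, and assumes the 16 byte granule from the definitions; `granule_range` is a made-up helper name.

```python
from typing import Tuple

GRANULE_SIZE = 16  # bytes covered by one allocation tag


def granule_range(addr: int, length: int) -> Tuple[int, int]:
    """Return (start, end) of the granule aligned range for a request."""
    if length < 0:
        raise ValueError("length must not be negative")
    # Addresses are rounded down to the start of their granule.
    start = addr & ~(GRANULE_SIZE - 1)
    # Lengths are rounded up to whole granules; 0 still means one granule.
    granules = max(-(-length // GRANULE_SIZE), 1)
    return start, start + granules * GRANULE_SIZE
```

For example, a request for 28 bytes at 0xfffff7ffa000 covers two granules, ending at 0xfffff7ffa020.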

### Commands

#### Avoiding Architecture Specific Naming

One problem you might see with the commands below is that they use l/a for
logical/allocation tags. These names are specific to MTE, for instance SPARC’s
ADI talks about “versions” instead. This limits the reuse of these
commands in the future.
(https://sourceware.org/gdb/current/onlinedocs/gdb/Sparc64.html#ADI-Support)

Instead we could first put them under “memory”, then merge the a/l tag commands
into “memory showtag” and “memory settag” (check -> checktag,
getconfig -> tagconfig).
This avoids the arch specific command names, though the output will still be
arch specific.

(lldb) memory showtag <addr> <length in bytes>
<addr>: logical 0x1 allocation: 0x1 0x2 0x3 ...
(lldb) memory settag <addr> <logical tag> <length in bytes> <allocation tags...>

Length and allocation tags would be optional. We could assume that if we only
get the logical tag arg, we should set both kinds of tag. This accommodates
future systems where there is only one type of tag, or you can only set them
all at once.

Whatever way you do it, there’s some kind of Arch dependent behaviour.

Another option would be to call them the “pointer tag” and the “memory tag”.
(which lends itself to being “memory tag/ptag” not “mtag mtag/ptag”
which is just confusing)

(lldb) memory showptrtag <addr>
(lldb) memory showtag <addr> <length>
(lldb) memory checktag <addr>

This makes the most sense to me and avoids having variable numbers of arguments
to commands.

#### mtag showltag <address expression>

Show the logical tag contained in the address given.

(lldb) mtag showltag a_ptr
0xF

Error conditions:
* As described above

#### mtag setltag <address expression> <tag value expression>

Set the logical tag of the variable that <address expression> resolves to, to
the value <tag value expression> resolves to.

(lldb) mtag setltag a_ptr 0xE

Error conditions:
* Address expression is not writable, e.g. for ptr+10 we can compute a new
  tagged value but have nowhere to write it back to.
* Tag value is out of 0x0 to 0xF range. (this limit is specific to AArch64)

#### mtag showatag <address expression> <optional length>

Show the allocation tag(s) associated with the granule of memory that
<address expression> points to. (this is reading target memory so the work will
be done in lldb-server)

<length> will default to 1 granule, otherwise you can provide a value in bytes
which will be rounded up to a whole number of granules. E.g 28 bytes becomes 32
bytes which is two granules so two tags.
(note that length of 0 also becomes 1 granule)

(lldb) mtag showatag a_ptr
[0xfffff7ffa000, 0xfffff7ffa010) : 0xE
(lldb) mtag showatag a_ptr 28
[0xfffff7ffa000, 0xfffff7ffa010) : 0xE
[0xfffff7ffa010, 0xfffff7ffa020) : 0xF

Error conditions:
* General failure to read tag memory on the target (a ptrace failure)
* Failure to read tags because MTE is not enabled
* Given <length> is less than zero

#### mtag setatag <address expression> <length> <tags...>

Set the allocation tags of the memory in range <address expression> to
<address expression> + <length> (where length is rounded up to a whole number of
granules, meaning length <16 = 1 granule) to the tags in <tags>.

Where <tags…> is one or more tag arguments either in hex or decimal. Once these
are validated they will be each packed with 1 byte per tag in the data
sent to lldb-server.

Note: this is a break from the current gdb design that has the user type the raw
bytes. For example:
(gdb) mtag setatag a_ptr 32 040F

This does make the command more flexible as validation is done server side but
we’re doing some validation client side for logical tags anyway. The
question is,
is this added convenience enough to break with gdb?
(though if we go with the alternate “memory …” naming scheme proposed
above, we might as well)

In the example below we’re giving granule 1 at a_ptr a tag of 0x4 and granule 2
at a_ptr+16 a tag of 0xF. The second example sets the tag of the
granule at a_ptr to 0x5.

(lldb) mtag setatag a_ptr 32 0x4 15
(lldb) mtag setatag a_ptr 1 5

In the case that the number of tags given is not enough to cover the memory
range, lldb-server will keep repeating the set until it does. Meaning a set of
2 tags would be repeated once to cover 4 granules. A set of 3 tags would be
written once with the first tag used again for the 4th granule.

Error conditions:
* Length is not a valid number or is less than 0
* One or more tags are out of the valid range of 0-0xF

#### mtag check <address expression>

Check that the logical tag in <address expression> matches the allocation tag
set for the granule it points to.

(lldb) mtag check a_ptr
Failed: logical tag 0x1 does not match allocation tag 0x2
(lldb) mtag check non_mte_ptr
Memory tagging is not enabled for address non_mte_ptr
(lldb) mtag check another_ptr
Passed: logical tag 0x1 matches allocation tag 0x1

Showing tags for a passed check seems redundant but I think it’s good to have as
a shortcut. That way you can use “mtag check” instead of “mtag showltag” then
“mtag showatag” if you want both tags.

Error conditions:
* Standard handling

#### mtag getconfig

This command will read the TAGGED_ADDR_CTRL register (see “New Registers”) and
pretty print its values. It's nice to have but certainly isn’t as good as being
able to pretty print a register in general. (which I don’t think is
possible right now)

(lldb) mtag getconfig
Tagged addressing: Enabled
Fault Mode: Synchronous
Included Tags: 0b1111000011110000
(lldb) mtag getconfig
Target process is not MTE enabled.

Formatting is up for debate of course, the point is you don’t have to shift
things in your head just to sanity check the debuggee’s usage.

Note: no “set” for this at this time as I think that’s going to be a
much rarer occurrence.

## Modified Commands

### memory region

Will use the extra information from the qMemoryRegionInfo packet to show the
VmFlags where possible. For example:

(lldb) memory region addr
[0x00007ffff7ed2000-0x00007ffff7fd2000) rw- /dev/zero (deleted)
flags: rd ex mr mw me dw sd mt

### memory read

Will not check that logical and allocation tags match, allowing reads
regardless.
Since most of the time checking is not the user’s intent when doing a read and
even if it is, there’s “mtag check” for that.

It will show allocation tags for memory that is MTE enabled. This is
on by default
on the basis that some subset of memory will be MTE so if you’re working with it
then tags are probably relevant. (new setting added to control this)

In the ideal scenario this looks like:
(lldb) memory read the_page
<Allocation tag 0x1 for range [0xfffff7ffa000, 0xfffff7ffa010)>
0xfffff7ffa000: 66 66 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff..............
<Allocation tag 0x1 for range [0xfffff7ffa010, 0xfffff7ffa020)>
0xfffff7ffa010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Obviously there’s a lot of formatting freedom with the read command so this
won’t always be as neat. It could be better to put the tags in the lines like:
0xfffff7ffa000 (tag 0x1): 66 66 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff..............

Then if the lines are <16 bytes each you can repeat the tag in the next line.
Or for >16 bytes do “(tag 0x1, 0x2)”. This needs some experimentation, it could
get very confusing if we’re showing the same tag next to two ranges and it looks
like two separate tags. For example here we’re showing the same tag twice:

0xfffff7ffa000 (tag 0x1): 66 66 00 00 00 00 00 00 ff......
0xfffff7ffa008 (tag 0x1): 00 00 00 00 00 00 00 00 ........

### memory write

Will allow writes where the tags are mismatched.

It will print warnings for granules where the tags do not match. Even
if we assume
we’re writing a lot of data, if the program is MTE enabled then most of the time
tags will match. So it’ll only be noise in rare situations. A setting will be
added to disable them if needed.

lldb will read ahead for the tags. So for a write of 64 bytes we read 4 tags,
do the write then warn about any granules that didn’t match.

(lldb) memory write the_page 99
(lldb) memory write mismatched_ptr 99
Warning: Logical Tag 0x1 did not match Allocation tag 0x2 for range [0xfffff7ffb000, 0xfffff7ffb010)
(lldb) memory write mismatched_ptr <17 bytes of data>
Warning: Logical Tag 0x1 did not match Allocation tag 0x2 for range [0xfffff7ffb000, 0xfffff7ffb010)
Warning: Logical Tag 0x1 did not match Allocation tag 0x2 for range [0xfffff7ffb010, 0xfffff7ffb020)

Hopefully “Warning” is enough to indicate that the write was still
done despite the mismatch.
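
A rough sketch of the read-ahead check described above: given the pointer being written through and the allocation tags already read for the range (one per granule), produce one warning per mismatching granule. It assumes the logical tag is in bits 56 to 59 and that the address is the pointer with its top byte dropped; `mismatch_warnings` is a hypothetical name.

```python
GRANULE_SIZE = 16  # bytes covered by one allocation tag
TAG_SHIFT = 56     # assumption: logical tag is in bits 56-59 of the pointer


def mismatch_warnings(ptr, allocation_tags):
    """Return warning strings for granules whose tag differs from ptr's.

    allocation_tags holds one tag per granule, starting at the granule
    that ptr points into, as read ahead of the write.
    """
    logical = (ptr >> TAG_SHIFT) & 0xF
    # Drop the top byte to get the address, then align it to a granule.
    addr = ptr & ((1 << TAG_SHIFT) - 1)
    start = addr & ~(GRANULE_SIZE - 1)
    warnings = []
    for index, tag in enumerate(allocation_tags):
        if tag != logical:
            low = start + index * GRANULE_SIZE
            warnings.append(
                f"Warning: Logical Tag {logical:#x} did not match "
                f"Allocation tag {tag:#x} for range "
                f"[{low:#x}, {low + GRANULE_SIZE:#x})"
            )
    return warnings
```

The write itself is not shown; in this sketch the warnings are computed independently of whether the write goes ahead.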

## New Settings

Like the commands these settings will be present/visible in help even when MTE
is not available. The category name will be “memory-tagging”.

* assume-tagging-enabled - When handling logical tags in pointers assume that
  the memory they point to is MTE enabled. This allows you to debug/test things
  such as Clang’s stack tagging that are not handled by the kernel.
(default False)
* warn-on-write-tag-mismatch - Print warnings for each mismatching granule when
  writing with “memory write”. (default True)
* show-tags-in-read - Show tags in “memory read” output. (default True)

## New Registers

MTE adds 1 new register to the ptrace interface, which is the TAGGED_ADDR_CTRL
register. User programs use this same register via prctl to enable MTE.

It contains:
* A 16 bit include mask for tag generation. So with 0x0001 you only get tags of
  0, with 0x0003 you would get tags of 0 or 1, etc.
  (the hardware register GCR_EL1 actually has the opposite, an exclude mask)
* 1 bit to say whether tagged addresses are enabled at all
* 1 bit to set the fault mode for mismatched tags. This can be none
(ignore failures),
  asynchronous or synchronous.
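
As an illustration of what pretty printing this register might involve, here is a sketch that decodes the three fields. The bit positions are an assumption taken from the PR_TAGGED_ADDR_ENABLE / PR_MTE_TCF_* / PR_MTE_TAG_* constants in Linux's prctl.h; the kernel branch referenced in this document may differ, and the function name is invented.

```python
# Assumed layout, following the PR_* constants in Linux's prctl.h:
#   bit 0      tagged addressing enable
#   bits 1-2   tag check fault mode
#   bits 3-18  16 bit include mask for tag generation
TAGGED_ADDR_ENABLE = 1 << 0
TCF_SHIFT, TCF_MASK = 1, 0x3
TAG_SHIFT, TAG_MASK = 3, 0xFFFF
FAULT_MODES = {0: "None", 1: "Synchronous", 2: "Asynchronous"}


def describe_tagged_addr_ctrl(value: int) -> str:
    """Format a raw register value as human readable lines."""
    enabled = bool(value & TAGGED_ADDR_ENABLE)
    mode = FAULT_MODES.get((value >> TCF_SHIFT) & TCF_MASK, "Unknown")
    included = (value >> TAG_SHIFT) & TAG_MASK
    return "\n".join([
        f"Tagged addressing: {'Enabled' if enabled else 'Disabled'}",
        f"Fault Mode: {mode}",
        # 16 binary digits, one per tag value, tag 0 on the right.
        f"Included Tags: {included:#018b}",
    ])
```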

So assuming we’re ok with pseudo registers like this being available via
“register read/write” it’ll be added to those. Probably under a “MTE Registers:”
or perhaps “Control Registers:” category.
(the latter could include future config regs such as pointer auth settings).

I say assuming because the current set are all what you’d call
hardware registers.
(though SVE might change this I’m not sure)

In “New Commands” I’ve also sketched out a command to read and pretty print the
register. Since I think most of the value will come from double
checking that you
passed the right flags to prctl, rather than modifying it on the fly.
(which could be done manually with “register write” if you really wanted to)

## SIGSEGV Handling

MTE faults raise a SIGSEGV with a specific si_code for synchronous or
asynchronous faults.
The former includes the address where the fault happened. So this will fit into
the existing handlers quite easily.

(lldb) run
<...>
Process 19648 stopped
* thread #1, name = 'main', stop reason = signal SIGSEGV: Asynchronous tag check fault
(lldb) run
<...>
Process 19648 stopped
* thread #1, name = 'main', stop reason = signal SIGSEGV: Synchronous tag check fault (fault address: 0x100000000, allocation tag: 0x1)

Showing the allocation tag here is a nice to have, making an extra call for this
one fault might be awkward. You’d want to look at the logical tag for
the pointer
that caused the fault, so “mtag check <ptr>” gives you both regardless.

Note that the fault address does not include the logical tag used to access it.
I think we could show the logical tag assuming lldb knows what the destination
register of the faulting instruction is. I haven’t done the research here so I’m
not proposing that we should do it for this round of support.

## Corefiles

The format of corefiles for MTE is currently undecided so there is nothing to
mention here yet. Obviously we want to use the new commands to work with them
once they’re available. Discussion on that design will start shortly.

## Remote Protocol Changes

Note: some of lldb-server’s interpretation of packed tags is also
described in the “mtag setatag” section above.

### Extending qMemoryRegionInfo

qMemoryRegionInfo currently gives us the start/size/permissions and name of a
mapping. For MTE we need to view the VmFlags line of the /proc/.../smaps file.
This contains the “mt” flag, showing MTE was enabled for that memory.

Example entry:
00400000-004f4000 r-xp 00000000 fc:00 6431901                    /bin/bash
Size: 976 kB
<...>
VmFlags: rd ex mr mw me dw sd

To do this we will add an optional “flags” tuple to the response packet.

flags:<flags>;
Contain the flags shown on the VmFlags line, encoded as ASCII text just like the
“name” field is. (spaces remain as the delimiter)

This tuple will be optional because Linux kernels before 3.10 do not
have this file.
(also “flags” not “vmflags” to not be Linux specific)
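
On the server side, extracting the flags from an smaps entry could look something like the sketch below. It only relies on the “VmFlags:” line format shown in the example entry; the helper names are made up.

```python
def parse_vm_flags(smaps_entry: str) -> list:
    """Return the flags from the VmFlags line of one smaps entry."""
    for line in smaps_entry.splitlines():
        if line.startswith("VmFlags:"):
            return line[len("VmFlags:"):].split()
    # Older kernels have no VmFlags line, so there is nothing to report.
    return []


def is_mte_enabled(flags) -> bool:
    """True if the mapping has the "mt" flag, meaning MTE is enabled."""
    return "mt" in flags
```

Returning an empty list when the line is missing matches the tuple being optional in the response.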

### qSupported feature

The name will be “memory-tagging” to align with the GDB implementation. If this
feature is supported by the server it means it understands the new
packets and the target supports MTE.

### qMemTags (new)

Used to read memory tags from the target. (lldb-server will use
PTRACE_PEEKMTETAGS to do so)

(the “addr,length” format is derived from the existing m/M packets for
read/write memory)
qMemTags:addr,length:type

* addr - big endian hex address of the start of the range to read from.
  (the ptrace interface will take care of rounding this down to the
nearest granule)
* length - big endian hex number of bytes of memory to read tags from.
  This will be interpreted by the server to decide how many tags to return.
* type - a signed int indicating the type of tags being sent. This will just be
  one value at this time, meaning MTE, but leaves room for future
multi tag type systems.

Note: The length is interpreted by the server so the packet spec doesn’t tell
you how you should do that. For AArch64 MTE lldb-server will be rounding up to
the nearest granule then returning 1 tag per granule. So 24 bytes becomes 32,
meaning 2 tags.

The reply is either:
* “mXX...” - (literal ‘m’) where XX is the hex encoded bytes of the tags read.
  (one tag per byte)
* “E nn” - An error code if one occurred. This will only be ‘01’ for
the time being.
  (it may prove useful to pass the ptrace error numbers through here
but it’s not
  needed for the current implementation)
* Empty reply - meaning the server doesn’t support this packet
  (in case the client didn’t pre-check this)

Note: The ‘m’ to start the tag data is present to support potential multi part
replies, where the last part would have ‘l’ instead.

Example exchange, reading the tags for the next 24 bytes of memory:
$qMemTags:CAFEF00D,18:1#<checksum>
$m0E0F#<checksum>
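
A client side sketch of this exchange, covering only the packet payload (the leading “$” and the checksum framing are left to the existing transport code). It assumes the type is sent as hex like the other fields, which the description above does not pin down; both function names are hypothetical.

```python
def build_qmemtags(addr: int, length: int, tag_type: int) -> str:
    """Build the payload of a qMemTags request."""
    # Assumption: the type is formatted as hex, like addr and length.
    return f"qMemTags:{addr:X},{length:x}:{tag_type:x}"


def parse_qmemtags_reply(reply: str) -> list:
    """Return the tags from a reply payload, one int per tag."""
    if reply == "":
        raise NotImplementedError("server does not support qMemTags")
    if reply.startswith("E"):
        raise RuntimeError(f"server returned error {reply[1:].strip()}")
    if not reply.startswith("m"):
        raise ValueError(f"unexpected reply: {reply!r}")
    # Tags are packed one per byte, so each pair of hex chars is one tag.
    return list(bytes.fromhex(reply[1:]))
```

Multi part replies ending in ‘l’ are not handled here since they are only a potential future extension.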

### QMemTags (new)

Write memory tags to the target. (lldb-server will use
PTRACE_POKEMTETAGS to do this)

QMemTags:address,length:type:tags

* address - big endian hex address of the start of the range to write to.
  (which the ptrace interface will align for us)
* length - big endian hex length in bytes of the range to be written
to (see note)
* type - signed int indicating the tag type. For now there will only
be one value,
  which means MTE.
* tags - hex encoded bytes of the tags to be written (one tag per
byte/per 2 hex chars)

Note: The length does not have to match the number of tags given. If it is more
than the given tags can cover, the tags are taken as a pattern to apply.
Examples: (remember 1 tag covers 1 granule/16 bytes)

Write 0 tags to 16 bytes -> Error, must have at least one tag to use
Write 1 tag to 0 bytes -> round up to next granule making 16, so write tag to 1st granule
Write 1 tag to 16 bytes -> writes 1st tag to 1st granule
Write 1 tag to 32 bytes -> writes 1st tag to the 1st and 2nd granule
Write 2 tags to 16 bytes -> writes the 1st tag to the 1st granule, 2nd tag is unused

In this way you can do bulk operations like clear all tags or stripe
them throughout some range.
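
The pattern behaviour in these examples can be sketched as below, assuming the 16 byte granule used throughout this document. `expand_tags` is an invented name for whatever lldb-server ends up doing internally.

```python
GRANULE_SIZE = 16  # bytes covered by one allocation tag


def expand_tags(tags, length):
    """Return one tag per granule in the range, repeating tags as needed."""
    if not tags:
        raise ValueError("must have at least one tag to use")
    # Round the length up to whole granules; 0 bytes still means 1 granule.
    granules = max(-(-length // GRANULE_SIZE), 1)
    # Cycle through the given tags; surplus tags are simply unused.
    return [tags[i % len(tags)] for i in range(granules)]
```

So `expand_tags([0x0], 4096)` clears the tags for a whole 4k page, and `expand_tags([0x1, 0x2], 64)` stripes two tags across four granules.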

Example packet, writing the tags for the next 24 bytes of memory. Setting
granule 1 to 0xE and the next to 0xF:
$QMemTags:CAFEF00D,18:1:0E0F#<checksum>

## Toolchain Requirements

We’re just using ptrace interfaces for this, we do not need tools capable of
assembling MTE instructions to be able to build. So there’ll be another header
in source/Plugins/Process/Linux/ containing the ptrace defines.

For some of the testing we will need an MTE toolchain to compile the
test programs,
same for corefiles.

## Testing

As much as possible will be done without needing an MTE system. Tests that need
an actual memory tagging enabled system will be tested using QEMU system mode
emulation. A document will be added to lldb’s documentation describing
how to run
the tests. (or added to the SVE testing docs)

## Interaction with Pointer Authentication

Armv8.3-a Pointer Authentication (PAC) also uses the upper bits of a pointer to
store metadata. PAC and MTE can be enabled in the same system and will share
those bits.
(ARM ARMv8 “Supported PAC field and relation to the use of address tagging”)

The position of the MTE tag does not change when PAC is enabled, so commands
do not need to check this first.

I think given the difference between the two schemes they should have
separate commands.
(MTE being a bitfield, PAC involving keys stored elsewhere)
Generic features like reporting mismatched tags/keys when reading memory could
apply to both so settings regarding that could be named generically.

<end doc>

Hi David, thanks for the great writeup. I hadn't been following the gdb MTE support.

This all looks reasonable to me. A few quick thoughts --

The initial idea of commands like "memory showptrtag", "memory showtag", "memory checktag" - it might be better to put all of these under "memory tag ...", similar to how "breakpoint command ..." works.

It makes sense to have lldb read/write the control pseudo-register as if it were a normal reg, in its own register grouping. You mentioned that you had some thoughts about how to make it more readable to users - I know this is something Greg has been hoping to do / see done at some point, for control registers where we could annotate the registers a lot better. I noticed that qemu for x86 provides exactly this kind of annotation information in its register target.xml definitions (v. lldb/test/API/functionalities/gdb_remote_client/TestRegDefinitionInParts.py ) but I don't THINK we do anything with these annotations today. Not at all essential to this work, but just noting that this is something we all would like to see better support for.

As for annotating the reason the program stopped on an MTE exception, Ismail was working on something similar in the past - although as you note, the really cool thing would be decoding the faulting instruction to understand what target register was responsible for the fault (and maybe even working back via the debug info to figure out what user-level variable it was??) to annotate it, which is something we're not doing anywhere right now. There was a little proof-of-concept thing that Sean Callanan did years ago "frame diagnose" which would try to annotate to the user in high-level source terms why a fault happened, but I think it was using some string matching of x86 instructions to figure out what happened. :)

We're overdue to upstream the PAC support for lldb that we're using, it's dependent on some other work being upstreamed that hasn't been done yet, but the general scheme involves querying the lldb-server / debugserver / corefile to get the number of bits used for virtual addressing, and then it just stomps on all the other high bits when trying to dereference values. If you do 'register read' of a function pointer, we show the actual value with PAC bits, then we strip the PAC bits off and if it resolves to a symbol, we print the stripped value and symbol that we're pointing to. It seems similar to what MTE will need -- if you have a variable pointing to heap using MTE, and you do `x/g var`, lldb should strip off the MTE bits before sending the address to read to lldb-server. The goal with the PAC UI design is to never hide the PAC details from the user, but to additionally show the PAC-less address when we're sure that it's an address in memory. Tougher to do that with MTE because we'll never be pointing to a symbol, it will be heap or stack.

J

Hi Jason,

I wanted to bring this to your attention that we are also working on pointer authentication support. We have so far only done register context changes to allow for enabling/disabling pointer authentication features and read/write pauth Cmask/Dmask registers when available. I am currently investigating unwinder support which means any further implementation from my side will be an overlap with what you guys have done already. There can also be design conflicts and I would really appreciate it if we can come on some common ground regarding upstreaming of Apple’s downstream pointer authentication patches. We may collaborate and upstream unwinder support.

Thanks!

Hi Omair, yea I need to start working with the upstream sources and you to coordinate this. My current implementation in the apple branch was something I've revised over the last two years, and further revision will be needed as we integrate it in the llvm.org source base.

My general design is that the Process object will keep track of the # of bits used for virtual addresses. For instance, with a gdb remote serial protocol connection, in the qHostInfo packet, the addressing_bits: key-value pair is sent with the number of bits used for addressing. Similarly, we use an LC_NOTE (similar to an ELF NOTE) to get the number of addressing bits that were in use for the corefile. These #'s are derived from TCR_EL0.T0SZ or TCR_EL1.T0SZ, where those numbers are the opposite of what we want. e.g. T0SZ might have a value of 25, which means that 39 bits are used for virtual addressing, 64-25=39. So at Process setup, we get the number of bits used for addressing by Some Appropriate Mechanism.
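
The arithmetic here, plus the "stomp on the high bits" step, could be sketched like this. It assumes a user-space address in the low half of the address space, where the unused high bits are expected to be zero once stripped; the function names are illustrative only and not the actual lldb API.

```python
def addressing_bits(t0sz: int) -> int:
    """Number of bits used for virtual addressing, given a T0SZ value."""
    return 64 - t0sz


def strip_high_bits(value: int, bits: int) -> int:
    """Clear every bit above the addressable range.

    Assumption: value is a low-half (user-space) address, so the
    non-address bits should all be zero after stripping.
    """
    return value & ((1 << bits) - 1)
```

With T0SZ of 25 this gives 39 addressing bits, so a value like 0x0E00007FF7FFA000 strips down to 0x7FF7FFA000. Passing an already clean address through is harmless, which is the property relied on below.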

I have an ABI method to strip off the PAC bits. This is a part of my design that needs a little more consideration -- I have a single method for getting a virtual address, and I have calls to it across the codebase, every time we know something *must* be an address, and it likely has authentication bits. Inside lldb, we're using the same ABI plugin for both PAC and non-PAC, because setting the high bits is harmless when we know a given bit cannot be used in addressing.

The problem is that different ABIs have different things signed in memory, so we'll need different calls into the ABI method to get the virtual address. For instance, in any ABI the lr is surely signed before it is spilled to stack on function entry - but is fp signed before being spilled? Should we have a single ABI::GetAsVirtualAddress that we put in every location that any ABI uses, or should we have a ABI::GetSpilledLRAsVirtualAddress, ABI::GetSpilledFPAsVirtualAddress, ABI::GetFunctionPointerAsVirtualAddress, ABI::GetVTablePointerAsVirtualAddress, and so on? The fact that passing a valid address through ABI::GetAsVirtualAddress (or whatever we call it) is *harmless* when we know it is addressable memory, makes me lean towards having generic lldb pass the superset of what all ABIs need through this single method.

You mentioned unwinding - a quick peek at our local sources, I have four places in RegisterContextLLDB.cpp (now RegisterContextUnwind.cpp) which needed to pass addresses through this ABI method.

There are user interface issues to consider. For instance, when you do "register read", lldb will print a register value, and if it resolves to a symbol, lldb will print the symbol name. With PAC bits, I print the actual register value (with PAC bits), strip the PAC bits, and if this value resolves to a symbolcontext, then I print the address-sans-PAC and the symbol name. There are probably a half dozen other places where I had to do the same thing --- the goal is to always show the user the actual register/pointer value with its PAC bits, and add information to that if we're confident that it's actually a virtual address.

Developers can add type qualifiers to their own structs to indicate that pointers should use pointer authentication. We have patches to represent this in the DWARF, and lldb reads those, so we know to strip the pac bits off of non-ABI objects that are using pointer auth - there are some changes to the Value object for this, and such. For SB API developers, we added a SBValue::GetValueAsAddress method (similar to SBValue::GetValueAsUnsigned, etc) so that people coding at the SB API layer can do the same thing lldb does -- show the value with PAC bits (GetValueAsUnsigned) and as an actual address (GetValueAsAddress) if they want to do that.

There's some additional changes for jitting expressions, and to be honest it's been ages since I've looked at that code so I can't speak on it very authoritatively without re-reading a bunch (and I authored very little of it).

A good place to start IMO, is the base idea of Process being the knower of the # of addressing bits, and the ABI being the knower of how to strip PAC bits off. I chose the ABI for this method because this *is* ABI, but given my ideas about an all-encompassing ABI::GetAsVirtualAddress that different parts of lldb pass things that can be signed in every ABI, maybe it doesn't even make sense to bother putting it in the ABI, it could go in Process and only strip off bits if the # of virtual addressing bits has been set.

Hm, another approach would be to have an ABI method that takes an address (with pac bits), an enumerated value describing the type of address it is, and then the ABI method would know whether to strip pac bits or not. The Linux AArch64 armv8.3 ABI might know it needs to strip PAC bits from a spilled fp, whereas the darwin arm64e ABI does not, or whatever. But I come back to the point that, when we're talking about PAC authentication bits here, if we know the # of addressable bits, and we know this value is an address in memory, it's harmless to strip, so having everyone pass the spilled FP value through the ABI::GetAsVirtualAddress method is fine if any ABI requires it. I may have ARMv8.3 too much in my mind here.

RFC: AArch64 Linux Memory Tagging Support for LLDB

What is memory tagging?

Memory tagging is an extension added in the Armv8.5-a architecture for AArch64.
It allows tagging pointers and storing those tags so that hardware can validate
that a pointer matches the memory address it is trying to access. These paired
tags are stored in the upper bits of the pointer (the “logical” tag) and in
special memory in hardware (the “allocation” tag). Each tag is 4 bits in size.

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/enhancing-memory-safety

Definitions

  • memtag - This is the clang name for the extension as in
    “-march=armv8.5-a+memtag”
  • mte - An alternative name for memtag, also the llvm backend name for
    the extension.
    This document may use memtag/memory tagging/MTE at times, they mean
    the same thing.
  • logical tag - The tag stored inside a pointer variable (accessible
    via normal shift and mask)
  • allocation tag - The tag stored in tag memory (which the hardware provides)
    for a particular tag granule
  • tag granule - The amount of memory that a single tag applies to,
    which is 16 bytes.

Existing Tool Support

New lldb features

Assuming your software is acting correctly, memory tagging can “just work”
without debugger support. This assumes the compiler/toolchain/user are
always correct.

For when that isn’t the case we want to be able to:

  • Read/write the logical tags in a pointer
  • Read/write the allocation tags assigned to a given area of memory
  • Test whether the logical tag in a pointer matches the allocation tag of the
    memory it refers to
  • Read/write memory even when tags are mismatched

The most obvious use case for this is working through issues where bugs in the
toolchain don’t generate correct code. On the other hand there’s a good case for
deliberately messing with pointers in your code to prove that such protection
actually works.

Note: potential extensions to scripting such as tags as attributes of values and
such are not being proposed here. Of course the new commands will be added in
the standard ways so you can use those.

New Commands

Command Availability

Note: commands will be listed in tab completion and help regardless of
these checks

  • The remote server must support memory tagging packets. lldb will send/check
    for the “memory-tagging” feature in the qSupported packet. (this
    name aligns with gdb)
  • The process must have MTE available. We check HWCAP2_MTE for this.
  • The process must have enabled tagged addressing using prctl
    (see “New Registers” for details)
  • The address given must be in a range that has MTE enabled, since you can mmap
    with or without MTE. (this information is in /proc/…/smaps)
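
As a rough illustration of the last check, the sketch below scans smaps text for the mapping containing an address and looks for the "mt" flag in its VmFlags line. The function name is hypothetical and this is not how lldb-server is proposed to implement it; it only shows the kind of lookup involved.

```python
def mte_enabled_for(smaps_text: str, address: int) -> bool:
    """Return True if the mapping containing `address` has the "mt" VmFlag."""
    in_mapping = False
    for line in smaps_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        start, dash, end = fields[0].partition("-")
        if dash:
            # Mapping header line, e.g. "fffff7ffa000-fffff7ffb000 rw-p ..."
            try:
                in_mapping = int(start, 16) <= address < int(end, 16)
            except ValueError:
                in_mapping = False
        elif fields[0] == "VmFlags:" and in_mapping:
            return "mt" in fields[1:]
    return False
```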

Interaction With Clang’s Stack Tagging

We’re relying on the kernel to tell us if MTE is enabled, so stack tagging will
not be visible to the debugger this way.
(https://github.com/google/sanitizers/wiki/Stack-instrumentation-with-ARM-Memory-Tagging-Extension-(MTE))

E.g. {int x; use(&x); } where use is void use(int* ptr);
“ptr” will have a memory tag but the kernel won’t know this.

To work around this a setting will be added to tell lldb to assume that MTE is
enabled, so that you can at least see the logical tags of a pointer.
(see “New Settings”)

General Properties/Errors

  • <address> must resolve to some value that can be handled as an address by
    lldb. (though it need not be a pointer specifically)

  • Tags will be printed in hexadecimal to reflect the fact that they are a 4 bit
    field. (and since tags are randomly generated, ordering is unlikely
    to be a concern)
  • Packed tags will be 1 tag per byte (matches what ptrace expects)
  • Addresses will be rounded down to the nearest granule (not always by lldb
    itself but what the user sees will look like this)
  • Ranges are rounded up to a whole number of granules
  • It is an error to use a command on an address that does not have MTE enabled.
    (with the exception of “mtag check”)
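
The rounding rules above can be sketched as follows. This is one possible reading in Python, assuming the address has already had its logical tag removed; the helper name is made up for the example.

```python
GRANULE_SIZE = 16  # bytes covered by one allocation tag on AArch64


def granule_range(address: int, length: int) -> tuple[int, int]:
    """Return the [start, end) range of whole granules covering the request.

    The start is rounded down to a granule boundary and the end is rounded
    up. A length of 0 is treated as 1 byte so it still covers one granule.
    """
    start = address & ~(GRANULE_SIZE - 1)
    end = address + max(length, 1)
    end = (end + GRANULE_SIZE - 1) & ~(GRANULE_SIZE - 1)
    return start, end
```

For an aligned address, 28 bytes gives a 32 byte range, which is two granules and so two tags.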

Commands

Avoiding Architecture Specific Naming

One problem you might see with the commands below is that they use l/a for
logical/allocation tags. These names are specific to MTE, for instance SPARC’s
ADI talks about “versions” instead. This limits the reuse of these
commands in the future.
(https://sourceware.org/gdb/current/onlinedocs/gdb/Sparc64.html#ADI-Support)

Instead we could first put them under “memory”, then merge the a/l tag commands
into “memory showtag” and “memory settag” (check → checktag,
getconfig → tagconfig).
Which avoids the arch specific names, though the output will still be.

(lldb) memory showtag <address> <length>
<address>: logical 0x1 allocation: 0x1 0x2 0x3 …
(lldb) memory settag <address> <logical tag> <length> <allocation tags>

Length and allocation tags would be optional. We could assume that if we only
get the logical tag arg, we should set both kinds of tag. This accommodates
future systems where there is only one type of tag, or you can only set them
all at once.

Whatever way you do it, there’s some kind of Arch dependent behaviour.

Another option would be to call them the “pointer tag” and the “memory tag”.
(which lends itself to being “memory tag/ptag” not “mtag mtag/ptag”
which is just confusing)

(lldb) memory showptrtag
(lldb) memory showtag
(lldb) memory checktag

This makes the most sense to me and avoids having variable numbers of arguments
to commands.

mtag showltag

Show the logical tag contained in the address given.

(lldb) mtag showltag a_ptr
0xF

Error conditions:

  • As described above

mtag setltag

Set the logical tag of the variable that <address> resolves to, to
the value <tag> resolves to.

(lldb) mtag setltag a_ptr 0xE

Error conditions:

  • Address variable is not writable, e.g. for ptr+10 we can compute a new tag
    but have nowhere to write it back to.
  • Tag value is out of 0x0 to 0xF range. (this limit is specific to AArch64)
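
For reference, reading and replacing a logical tag is just shift and mask work. The sketch below assumes the MTE logical tag occupies bits 56 to 59 of the pointer; the function names are invented for the example and are not proposed lldb APIs.

```python
TAG_SHIFT = 56  # the logical tag sits in bits 56-59 of the pointer
TAG_MASK = 0xF  # 4 bit tag


def get_logical_tag(pointer: int) -> int:
    """Extract the 4 bit logical tag from a pointer value."""
    return (pointer >> TAG_SHIFT) & TAG_MASK


def set_logical_tag(pointer: int, tag: int) -> int:
    """Return `pointer` with its logical tag replaced by `tag`."""
    if not 0 <= tag <= TAG_MASK:
        raise ValueError(f"tag {tag:#x} is out of the range 0x0-0xF")
    return (pointer & ~(TAG_MASK << TAG_SHIFT)) | (tag << TAG_SHIFT)
```

The new pointer value still has to be written back to the variable, which is why an expression like ptr+10 cannot be the target of a set.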

mtag showatag

Show the allocation tag(s) associated with the granule of memory that <address>
points to. (this is reading target memory so the work will be done in
lldb-server)

<length> will default to 1 granule, otherwise you can provide a value in bytes
which will be rounded up to a whole number of granules. E.g. 28 bytes becomes 32
bytes which is two granules so two tags.
(note that a length of 0 also becomes 1 granule)

(lldb) mtag showatag a_ptr
[0xfffff7ffa000, 0xfffff7ffa010) : 0xE
0xE
(lldb) mtag showatag a_ptr 28
[0xfffff7ffa000, 0xfffff7ffa010) : 0xE
[0xfffff7ffa010, 0xfffff7ffa020) : 0xF

Error conditions:

  • General failure to read tag memory on the target (a ptrace failure)
  • Failure to read tags because MTE is not enabled
  • Given <length> is less than zero

mtag setatag <tags…>

Set the allocation tags of the memory in the range <address> to
<address>+<length> (where length is rounded up to a whole number of granules,
meaning length <16 = 1 granule) to the tags in <tags…>.

Where <tags…> is one or more tag arguments either in hex or decimal. Once these
are validated they will each be packed with 1 byte per tag in the data
sent to lldb-server.

Note: this is a break from the current gdb design that has the user type the raw
bytes. For example:
(gdb) mtag setatag a_ptr 32 040F

This does make the command more flexible as validation is done server side but
we’re doing some validation client side for logical tags anyway. The question
is, is this added convenience enough to break with gdb?
(though if we go with the alternate “memory …” naming scheme proposed
above, we might as well)

In the example below we’re giving granule 1 at a_ptr a tag of 0x4 and granule 2
at a_ptr+16 a tag of 0xF. The second example sets the tag of the
granule at a_ptr to 0x5.

(lldb) mtag setatag a_ptr 32 0x4 15
(lldb) mtag setatag a_ptr 1 5

In the case that the number of tags given is not enough to cover the memory
range, lldb-server will keep repeating the set until it does. Meaning a set of
2 tags would be repeated once to cover 4 granules. A set of 3 tags would be
written once with the first tag used again for the 4th granule.
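
The packing and repetition described above could look something like this. It is only a sketch: pack_tags stands in for the client side validation and expand_tags for what lldb-server would do, and both names are made up here.

```python
from itertools import cycle, islice

MAX_TAG = 0xF  # AArch64 allocation tags are 4 bits wide


def pack_tags(tags: list[int]) -> bytes:
    """Validate the tags and pack them one tag per byte (client side)."""
    if not tags:
        raise ValueError("at least one tag is required")
    for tag in tags:
        if not 0 <= tag <= MAX_TAG:
            raise ValueError(f"tag {tag:#x} is out of the range 0x0-0xF")
    return bytes(tags)


def expand_tags(packed: bytes, granules: int) -> list[int]:
    """Repeat the packed tags until every granule has one (server side)."""
    return list(islice(cycle(packed), granules))
```

So "mtag setatag a_ptr 32 0x4 15" packs to the two bytes 04 0F, and a set of 3 tags over 4 granules wraps around to reuse the first tag.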

Error conditions:

  • Length is not a valid number or is less than 0
  • One or more tags are out of the valid range of 0-0xF

mtag check

Check that the logical tag in <address> matches the allocation tag
set for the granule it points to.

(lldb) mtag check a_ptr
Failed: logical tag 0x1 does not match allocation tag 0x2
(lldb) mtag check non_mte_ptr
Memory tagging is not enabled for address non_mte_ptr
(lldb) mtag check another_ptr
Passed: logical tag 0x1 matches allocation tag 0x1

Showing tags for a passed check seems redundant but I think it’s good to have as
a shortcut. That way you can use “mtag check” instead of “mtag showltag” then
“mtag showatag” if you want both tags.
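
A sketch of the comparison behind the three outcomes shown above, assuming the allocation tag has already been read from the target (None meaning the address is not in MTE enabled memory) and that the logical tag is in bits 56 to 59. The function name and signature are illustrative only.

```python
from typing import Optional


def check_tags(pointer: int, allocation_tag: Optional[int], name: str) -> str:
    """Compare a pointer's logical tag with its granule's allocation tag."""
    if allocation_tag is None:
        return f"Memory tagging is not enabled for address {name}"
    logical_tag = (pointer >> 56) & 0xF
    if logical_tag == allocation_tag:
        return (f"Passed: logical tag {logical_tag:#x} matches "
                f"allocation tag {allocation_tag:#x}")
    return (f"Failed: logical tag {logical_tag:#x} does not match "
            f"allocation tag {allocation_tag:#x}")
```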

Error conditions:

  • Standard handling

mtag getconfig

This command will read the TAGGED_ADDR_CTRL register (see “New Registers”) and
pretty print its values. It’s nice to have but certainly isn’t as good as being
able to pretty print a register in general. (which I don’t think is
possible right now)

(lldb) mtag getconfig
Tagged addressing: Enabled
Fault Mode: Synchronous
Included Tags: 0b1111000011110000
(lldb) mtag getconfig
Target process is not MTE enabled.

Formatting is up for debate of course, the point is you don’t have to shift
things in your head just to sanity check the debuggee’s usage.

Note: no “set” for this at this time as I think that’s going to be a
much rarer occurrence.

Modified Commands

memory region

Will use the extra information from the qMemoryRegionInfo packet to show the
VmFlags where possible. For example:

(lldb) memory region addr
[0x00007ffff7ed2000-0x00007ffff7fd2000) rw- /dev/zero (deleted)
flags: rd ex mr mw me dw sd mt

memory read

Will not check that logical and allocation tags match, allowing reads
regardless, since most of the time checking is not the user’s intent when doing
a read and even if it is, there’s “mtag check” for that.

It will show allocation tags for memory that is MTE enabled. This is on by
default on the basis that some subset of memory will be MTE so if you’re working
with it then tags are probably relevant. (new setting added to control this)

In the ideal scenario this looks like:
(lldb) memory read the_page
<Allocation tag 0x1 for range [0xfffff7ffa000, 0xfffff7ffa010)>
0xfffff7ffa000: 66 66 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ff…
<Allocation tag 0x1 for range [0xfffff7ffa010, 0xfffff7ffa020)>
0xfffff7ffa010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Obviously there’s a lot of formatting freedom with the read command so this
won’t always be as neat. It could be better to put the tags in the lines like:
0xfffff7ffa000 (tag 0x1): 66 66 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff…

Then if the lines are <16 bytes each you can repeat the tag in the next line.
Or for >16 bytes do “(tag 0x1, 0x2)”. This needs some experimentation, it could
get very confusing if we’re showing the same tag next to two ranges and it looks
like two separate tags. For example here we’re showing the same tag twice:

0xfffff7ffa000 (tag 0x1): 66 66 00 00 00 00 00 00 ff…
0xfffff7ffa008 (tag 0x1): 00 00 00 00 00 00 00 00 …
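
To experiment with the inline layout, something like the sketch below is enough. It assumes the read starts on a granule boundary and that one tag per granule has already been fetched; the function is hypothetical and only mimics the output style shown above.

```python
GRANULE_SIZE = 16


def format_tagged_read(start: int, data: bytes, tags: list[int],
                       bytes_per_line: int = 16) -> list[str]:
    """Format memory as hex lines, each prefixed with the tag(s) it spans."""
    lines = []
    for offset in range(0, len(data), bytes_per_line):
        chunk = data[offset:offset + bytes_per_line]
        # Index of the first and last granule touched by this line.
        first = offset // GRANULE_SIZE
        last = (offset + len(chunk) - 1) // GRANULE_SIZE
        line_tags = ", ".join(f"{tags[i]:#x}" for i in range(first, last + 1))
        hex_bytes = " ".join(f"{byte:02x}" for byte in chunk)
        lines.append(f"{start + offset:#x} (tag {line_tags}): {hex_bytes}")
    return lines
```

With 8 bytes per line the same tag is repeated on consecutive lines, and with 32 bytes per line each line carries two tags, which are the two awkward cases mentioned above.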

memory write

Will allow writes where the tags are mismatched.

It will print warnings for granules where the tags do not match. Even if we
assume we’re writing a lot of data, if the program is MTE enabled then most of
the time tags will match. So it’ll only be noise in rare situations. A setting
will be added to disable them if needed.

lldb will read ahead for the tags. So for a write of 64 bytes we read 4 tags,
do the write then warn about any granules that didn’t match.

(lldb) memory write the_page 99
(lldb) memory write mismatched_ptr 99
Warning: Logical Tag 0x1 did not match Allocation tag 0x2 for range
[0xfffff7ffb000, 0xfffff7ffb010)
(lldb) memory write mismatched_ptr <17 bytes of data>
Warning: Logical Tag 0x1 did not match Allocation tag 0x2 for range
[0xfffff7ffb000, 0xfffff7ffb010)
Warning: Logical Tag 0x1 did not match Allocation tag 0x2 for range
[0xfffff7ffb010, 0xfffff7ffb020)

Hopefully “Warning” is enough to indicate that the write was still
done despite the mismatch.
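
The read-ahead check might be sketched like this, assuming the allocation tags for the covered granules have already been fetched and that the pointer's top byte holds the logical tag (so it is dropped to get the address). The helper is made up for illustration.

```python
GRANULE_SIZE = 16


def mismatched_ranges(pointer: int, length: int,
                      allocation_tags: list[int]) -> list[tuple[int, int, int]]:
    """Return (start, end, allocation_tag) for each mismatching granule."""
    logical_tag = (pointer >> 56) & 0xF
    address = pointer & ~(0xFF << 56)  # drop the top byte holding the tag
    start = address & ~(GRANULE_SIZE - 1)
    # Number of granules the write touches, rounding the end up.
    end = address + max(length, 1)
    count = (end - start + GRANULE_SIZE - 1) // GRANULE_SIZE
    mismatches = []
    for index, allocation_tag in enumerate(allocation_tags[:count]):
        if allocation_tag != logical_tag:
            granule = start + index * GRANULE_SIZE
            mismatches.append((granule, granule + GRANULE_SIZE, allocation_tag))
    return mismatches
```

A 17 byte write spans two granules, so it can produce two warnings as in the last example above.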

New Settings

Like the commands these settings will be present/visible in help even when MTE
is not available. The category name will be “memory-tagging”.

  • assume-tagging-enabled - When handling logical tags in pointers assume that
    the memory they point to is MTE enabled. This allows you to debug/test things
    such as Clang’s stack tagging that are not handled by the kernel.
    (default False)
  • warn-on-write-tag-mismatch - Print warnings for each mismatching granule when
    writing with “memory write”. (default True)
  • show-tags-in-read - Show tags in “memory read” output. (default True)

New Registers

MTE adds 1 new register to the ptrace interface, which is the TAGGED_ADDR_CTRL
register. User programs use this same register via prctl to enable MTE.

It contains:

  • A 16 bit include mask for tag generation. So with 0x0001 you only get tags of
    0, with 0x0003 you would get tags of 0 or 1, etc.
    (the hardware register GCR_EL1 actually has the opposite, an exclude mask)
  • 1 bit to say whether tagged addresses are enabled at all
  • A 2 bit field to set the fault mode for mismatched tags. This can be none
    (ignore failures), asynchronous or synchronous.

So assuming we’re ok with pseudo registers like this being available via
“register read/write” it’ll be added to those. Probably under a “MTE Registers:”
or perhaps “Control Registers:” category.
(the latter could include future config regs such as pointer auth settings).
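
For a feel of what pretty printing this value involves, here is a sketch of a decoder. The bit positions follow the prctl constants in the Linux headers as I understand them (enable in bit 0, a two bit fault mode field from bit 1, the include mask from bit 3); treat them as assumptions to check against your kernel's prctl.h. The function itself is hypothetical.

```python
PR_TAGGED_ADDR_ENABLE = 1 << 0
PR_MTE_TCF_SHIFT = 1   # two bit fault mode field (assumed layout)
PR_MTE_TAG_SHIFT = 3   # 16 bit include mask (assumed layout)

FAULT_MODES = {0: "None", 1: "Synchronous", 2: "Asynchronous"}


def describe_tagged_addr_ctrl(value: int) -> list[str]:
    """Decode a TAGGED_ADDR_CTRL value into human readable lines."""
    enabled = bool(value & PR_TAGGED_ADDR_ENABLE)
    mode = (value >> PR_MTE_TCF_SHIFT) & 0x3
    include_mask = (value >> PR_MTE_TAG_SHIFT) & 0xFFFF
    return [
        f"Tagged addressing: {'Enabled' if enabled else 'Disabled'}",
        f"Fault Mode: {FAULT_MODES.get(mode, 'Unknown')}",
        f"Included Tags: {include_mask:#018b}",
    ]
```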

I say assuming because the current set are all what you’d call hardware
registers. (though SVE might change this, I’m not sure)

In “New Commands” I’ve also sketched out a command to read and pretty print the
register, since I think most of the value will come from double checking that
you passed the right flags to prctl, rather than modifying it on the fly.
(which could be done manually with “register write” if you really wanted to)

SIGSEGV Handling

MTE faults raise a SIGSEGV with a specific si_code for synchronous or
asynchronous faults. The former includes the address where the fault happened.
So this will fit into the existing handlers quite easily.

(lldb) run
<…>
Process 19648 stopped
* thread #1, name = ‘main’, stop reason = signal SIGSEGV: Asynchronous tag check fault
(lldb) run
<…>
Process 19648 stopped
* thread #1, name = ‘main’, stop reason = signal SIGSEGV: Synchronous tag check fault (fault address: 0x100000000, allocation tag: 0x1)
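
A sketch of how the si_code could map onto these stop descriptions. The si_code values are the ones I believe the Linux headers use (SEGV_MTEAERR = 8, SEGV_MTESERR = 9), so treat them as assumptions; the function is illustrative and leaves out the optional allocation tag.

```python
from typing import Optional

SEGV_MTEAERR = 8  # asynchronous tag check fault (assumed value)
SEGV_MTESERR = 9  # synchronous tag check fault (assumed value)


def describe_segv(si_code: int, fault_address: Optional[int] = None) -> str:
    """Build a stop reason string for a SIGSEGV based on its si_code."""
    if si_code == SEGV_MTEAERR:
        # Asynchronous faults do not report a fault address.
        return "signal SIGSEGV: Asynchronous tag check fault"
    if si_code == SEGV_MTESERR and fault_address is not None:
        return ("signal SIGSEGV: Synchronous tag check fault "
                f"(fault address: {fault_address:#x})")
    return "signal SIGSEGV"
```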

Showing the allocation tag here is a nice to have; making an extra call for this
one fault might be awkward. You’d want to look at the logical tag for the
pointer that caused the fault anyway, so “mtag check” gives you both regardless.

Note that the fault address does not include the logical tag used to access it.
I think we could show the logical tag assuming lldb knows what the destination
register of the faulting instruction is. I haven’t done the research here so I’m
not proposing that we should do it for this round of support.

Note that the address tag is guaranteed to be present for tag check faults in the system register FAR_EL1, but the tag is currently stripped by the kernel before being stored in siginfo.si_addr. I am working on a kernel patch which will make this information available via siginfo, and once the tag becomes available from the kernel you shouldn’t need to decode the instruction.

Peter

The initial idea of commands like "memory showptrtag", "memory showtag", "memory checktag" - it might be better to put all of these under "memory tag ...", similar to how "breakpoint command ..." works.

Sounds good to me, I didn't know there was a 3 level command in there
already. The names get a bit redundant since "memory tag set" doesn't
tell you which one of the pair it's setting. So we could have "memory
tag setptrtag" "memory tag setmemorytag", or make "set" one command
with variable arguments:
Set logical tag: memory tag set <addr> <pointer tag>
Set logical and allocation: memory tag set <addr> <pointer tag>
<length> <tags...>
Set only allocation: memory tag set <addr> --only-memory <length> <tags...>
(which I think is a bit neater)

Where "pointer tag" and "memory tag" were the best generic names for
"logical" and "allocation" I came up with. (think of it like the
memory tag is attached to the memory, pointer tag is attached to a
pointer)
Also "memory tag check" can be removed since it's just "memory tag
show" with a warning on mismatch.

My general design is that the Process object will keep track of the # of bits used for virtual addresses.

I hadn't considered this issue, thanks for bringing it up. Your scheme
seems reasonable to me. I see that "addressing_bits" is in the
upstream qHostInfo but only in the RNBRemote, does that mean that
upstream already uses this in some way? (presumably just for Apple
platforms?)

I am working on a kernel patch which will make this information available via siginfo, and once the tag becomes available from the kernel you shouldn't need to decode the instruction.

Great! I'll keep an eye on it.

Hi all, the first series of changes for MTE has been in review on
Phabricator for a while now. All bar one have been approved by
Omair Javaid and we'd like to start landing them with a view towards
having MTE support for llvm 13. (there is another series waiting in
the wings for tag writing)

Not calling out anyone in particular who I might have added to any of
the reviews, we're all busy folk. I'm just aware that memory tagging
is an AArch64 Linux only feature right now and Omair and myself are
both Linaro so I want to give others a chance to comment on the
changes from a general lldb perspective.

You can find the changes here:
https://reviews.llvm.org/D97281
https://reviews.llvm.org/D97282
https://reviews.llvm.org/D95601
https://reviews.llvm.org/D95602
https://reviews.llvm.org/D97285

Even if you are not familiar with memory tagging, if something
generally sticks out, let me know. If I don't receive any request for
changes in the next few days I'll start landing them.

If you want to peek at the changes that aren't in review yet you can
do so here: https://github.com/DavidSpickett/llvm-project/commits/mte_commands