[PATCH] x86: disambiguate unqualified btr, bts

Please send patches to llvm-commits. Please include a testcase with
each patch. Please check that your patch actually works correctly
before sending it to the mailing list for review. (See
http://llvm.org/docs/DeveloperPolicy.html .)

-Eli

Also, please elaborate on why this is a good change. Because gas accepts it isn’t sufficient reason in and of itself.

-Jim

Eli Friedman wrote:

I've probably done something stupid; seems to build correctly, but
that's all I know. Also, tests are pending.

Please send patches to llvm-commits. Please include a testcase with
each patch. Please check that your patch actually works correctly
before sending it to the mailing list for review.

I know. I was asking for a sanity check before doing another
iteration. Never mind: I'll submit another iteration soon.

Jim Grosbach wrote:

Also, please elaborate on why this is a good change. Because gas accepts it
isn’t sufficient reason in and of itself.

That they're valid instructions isn't sufficient reason? Should I
additionally say that linux.git uses them?

I wrote:

Jim Grosbach wrote:

Also, please elaborate on why this is a good change. Because gas accepts it
isn’t sufficient reason in and of itself.

That they’re valid instructions isn’t sufficient reason? Should I
additionally say that linux.git uses them?

Is the diagnostic incorrect?

To say that another way, is the assembler correctly diagnosing a previously unnoticed problem in the project source code, or is the assembler not behaving correctly according the the documented Intel assembly mnemonics? If the former, the fix belongs in the project, not the assembler. If the latter, then we should absolutely fix the assembler. From your emails, it is unclear to me which is the case.

-Jim

Jim Grosbach wrote:

To say that another way, is the assembler correctly diagnosing a previously
unnoticed problem in the project source code, or is the assembler not
behaving correctly according the the documented Intel assembly mnemonics?

Where are the authoritative instruction set pages? If such a thing
were readily available, why are there gaps in the current
implementation? A quick Googling gets me [1], but I can't say it's
authoritative. What's important is that there certainly are
architectures where btr/bts are valid instructions, and they must be
supported. btr/bts are certainly not invalid instructions that we're
bending over backwards to support, because linux.git/gas works with
them.

[1]: http://web.itu.edu.tr/kesgin/mul06/intel/index.html

The correct answer here is "Vol 2a of the Intel Reference Manual
available off of the Intel website".

As far as whether or not we should support the instructions with a
lack of length mnemonic, I'm going to step back away from the
discussion. The Intel manuals support Intel syntax (which we also
support) which necessarily doesn't have length encodings for most
instructions, but AT&T syntax which gas uses (and llvm supports as
well) often has length specifiers for instructions.

-eric

Jim Grosbach wrote:

To say that another way, is the assembler correctly diagnosing a previously
unnoticed problem in the project source code, or is the assembler not
behaving correctly according the the documented Intel assembly mnemonics?

Where are the authoritative instruction set pages? If such a thing
were readily available, why are there gaps in the current
implementation? A quick Googling gets me 1, but I can’t say it’s
authoritative. What’s important is that there certainly are
architectures where btr/bts are valid instructions, and they must be
supported. btr/bts are certainly not invalid instructions that we’re
bending over backwards to support, because linux.git/gas works with
them.

_

The correct answer here is "Vol 2a of the Intel Reference Manual
available off of the Intel website”.

Yep, for Intel syntax that’s exactly right.

As far as whether or not we should support the instructions with a
lack of length mnemonic, I’m going to step back away from the
discussion. The Intel manuals support Intel syntax (which we also
support) which necessarily doesn’t have length encodings for most
instructions, but AT&T syntax which gas uses (and llvm supports as
well) often has length specifiers for instructions.

The length specifier is, as I understand it, required when the instruction references memory but is optional (and inferred from the registers) for the register variants.

The best reference I know of for the AT&T syntax is: http://docs.oracle.com/cd/E19253-01/817-5477/817-5477.pdf

That does raise a clarifying question here. Is the code you’re interested in using Intel or AT&T syntax?

Also note that the question isn’t whether we should support the btr/bts instructions. We absolutely must (and do). The question is whether we are properly handling the un-suffixed mnemonic form of the assembly syntax.

Perhaps you can help by clarifying via explicit example what code you believe should work but does that. Descriptions are great, but specifics are better once we’re this deep into the details.

-Jim

I’m not sure I’d use the documentation for the Solaris assembler as authoritative for AT&T syntax, but page 17 does say that if the suffix is omitted it defaults to long.

However, that isn’t my experience with gas which uses register operands to disambiguate, if possible (although I’m on a phone and can’t check right now).

The length specifier is, as I understand it, required when the instruction references memory but is optional (and inferred from the registers) for the register variants.

The best reference I know of for the AT&T syntax is: http://docs.oracle.com/cd/E19253-01/817-5477/817-5477.pdf

I’m not sure I’d use the documentation for the Solaris assembler as authoritative for AT&T syntax, but page 17 does say that if the suffix is omitted it defaults to long.

Yeah, me either. That’s part of why I’m asking for references. I’d love to know which ones other people use. Good docs for AT&T syntax are annoyingly hard to find.

However, that isn’t my experience with gas which uses register operands to disambiguate, if possible (although I’m on a phone and can’t check right now).

Yep, it’s the memory reference versions which IIUC require a suffix.

I want to make sure we do the right thing here and know why we’re doing it rather than just adding some aliases because it matches gas’ behavior.

-Jim

http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/arch/x86/include/asm/bitops.h#L68

Here is one example that I found. Are the inline assembly arguments ambiguous in size?

-Jevin

The reason it's the right thing to do is that the mem/imm forms of
btsw and btsl have exactly the same semantics.

-Eli

Jim Grosbach wrote:

That does raise a clarifying question here. Is the code you’re interested in
using Intel or AT&T syntax?

Also note that the question isn’t whether we should support the btr/bts
instructions. We absolutely must (and do). The question is whether we are
properly handling the un-suffixed mnemonic form of the assembly syntax.

Perhaps you can help by clarifying via explicit example what code you
believe should work but does that. Descriptions are great, but specifics are
better once we’re this deep into the details.

I don't know! This is the first time I'm hearing about differences
between Intel and AT&T syntax; I have absolutely no clue how btr/bts
_should_ be implemented, and the only specifics I have are real-world
examples: linux.git/gas. Does looking at bitops.h in the kernel tree
help?

For the record, I don't think matching linux.git/gas is a problem:
they're very authoritative pieces of software that have been around
for a _really_ long time. What matters to me is that real-world
software works as expected, even if it violates some transcendental
(but unimplemented) authoritative standard [1]. If other mainstream
assemblers handle btr/bts differently, I would see a cause for worry;
otherwise, no.

Eli Friedman wrote:

The reason it's the right thing to do is that the mem/imm forms of
btsw and btsl have exactly the same semantics.

Not sure I understand this.

[1]: Especially considering that we can't seem to find one.

For the record, I don't think matching linux.git/gas is a problem:
they're very authoritative pieces of software that have been around
for a _really_ long time.

That's a very, very weak argument. There are a lot of things Clang
rejects as errors by default that has been used in old code bases,
because GCC accepted it.

Eli Friedman wrote:
> The reason it's the right thing to do is that the mem/imm forms of
> btsw and btsl have exactly the same semantics.

Not sure I understand this.

There is no way to tell from the arguments whether bts should be btsw or
btsl. That's the ambiguity it is complaining about. Fixing it is
trivial. There are other cases with similar issues in the 8087 syntax
with size of the floating point operand.

Joerg

It would help us for sure to build the kernel and others.

[Please be friendly towards non-subscribers and always reply-to-all; I
found your message in the list archives]

Joerg Sonnenberger wrote:

> Eli Friedman wrote:
> > The reason it's the right thing to do is that the mem/imm forms of
> > btsw and btsl have exactly the same semantics.
>
> Not sure I understand this.

There is no way to tell from the arguments whether bts should be btsw or
btsl. That's the ambiguity it is complaining about. Fixing it is
trivial. There are other cases with similar issues in the 8087 syntax
with size of the floating point operand.

I see. Can you explain the results of this experiment to me?

Input #1:
  btrw $1, 0
  btr $1, 0
  btsw $1, 0
  bts $1, 0

Output #1 from gas:
   1 0000 660FBA34 btrw $1, 0
   1 25000000
   1 0001
   2 000a 0FBA3425 btr $1, 0
   2 00000000
   2 01
   3 0013 660FBA2C btsw $1, 0
   3 25000000
   3 0001
   4 001d 0FBA2C25 bts $1, 0
   4 00000000
   4 01

Input #2:
  btrl $1, 0
  btr $1, 0
  btsl $1, 0
  bts $1, 0

Output #2 from gas:
   1 0000 0FBA3425 btrl $1, 0
   1 00000000
   1 01
   2 0009 0FBA3425 btr $1, 0
   2 00000000
   2 01
   3 0012 0FBA2C25 btsl $1, 0
   3 00000000
   3 01
   4 001b 0FBA2C25 bts $1, 0
   4 00000000
   4 01
   5

bts{w} is defined as: ins i16mem:$src1, i16i8imm:$src2
while bts{l} is defined as: ins i32mem:$src1, i32i8imm:$src2

My disambiguation of bts is defined as: def : InstAlias<"btr $imm,
$mem", (BTR32mi8 i32mem:$mem, i32i8imm:$imm)>;

ie. I treat the first operand is an i32mem, and the second is as
i32i8imm, to disambiguate bts to btsl; exactly like bt disambiguates
to btl in the previous line (from 824a907):

def : InstAlias<"bt $imm, $mem", (BT32mi8 i32mem:$mem, i32i8imm:$imm)>;

What am I missing?

> For the record, I don't think matching linux.git/gas is a problem:
> they're very authoritative pieces of software that have been around
> for a _really_ long time.

That's a very, very weak argument. There are a lot of things Clang
rejects as errors by default that has been used in old code bases,
because GCC accepted it.

That was a personal opinion, and I've consistently demonstrated that I
have no experience/ in-depth knowledge of assemblers. I'm not sure
what to do myself, but this disambiguation seems to be the reasonable
(not claiming Correct or Incorrect) solution. My agenda is simple: I
don't like the outdated cruft that GNU as is, and I think a lot of
codebases can benefit from the transition to LLVM's assembler. That's
not going to happen by sitting around doing nothing.

Eli Friedman wrote:

The reason it's the right thing to do is that the mem/imm forms of
btsw and btsl have exactly the same semantics.

The Intel documentation implies that this is the case:

If the bit base operand specifies a memory location, it represents the address of the byte in memory that contains the bit base (bit 0 of the specified byte) of the bit string (see Figure 3-2). The offset operand then selects a bit position within the range −231 to 231 − 1 for a register offset and 0 to 31 for an immediate offset.

However, this doesn't seem to be true if the immediate value is greater than 15. See the attached imm.c which prints the results of bts m16,imm8 (btsw) and bts m32,imm8 (btsl) with the immediate in [0,63]. For btsw, only the least significant 4 bits of the immediate seem to be used whereas for btsl, only the least significant 5 bits seem to be used.

In contrast, bts m16,r16 (btsw) and bts m32,r32 (btsl) are identical for bit offset operands in [0,63] and are likely identical for [-2^{15}, 2^{15}-1], although I didn't actually check the others. See the attached reg.c which changes the immediate constraints to register constraints.

btr behaves similarly.

For the memory, immediate form without the suffix, it seems like the options are
1. If the immediate value is in [0,15], use btsl/btrl since it saves a byte, otherwise error;
2. Follow both gas's behavior and the Solaris assembler manual Jim Grosbach linked to which stated that unsuffixed instructions are assumed to be long and alias bts to btsl and btr to btrl; or
3. Always error even when there is no ambiguity.

I have no opinion on which option LLVM should follow.

imm.c (576 Bytes)

reg.c (578 Bytes)

Stephen Checkoway wrote:

[...]

Thanks for the absolutely splendid analysis!

For the memory, immediate form without the suffix, it seems like the options are
1. If the immediate value is in [0,15], use btsl/btrl since it saves a byte, otherwise error;
2. Follow both gas's behavior and the Solaris assembler manual Jim Grosbach linked to which stated that unsuffixed instructions are assumed to be long and alias bts to btsl and btr to btrl; or
3. Always error even when there is no ambiguity.

I have no opinion on which option LLVM should follow.

Okay, so while digging through the history of the linux.git tree, I
found this. The patch _replaces_ btrl instructions with btr
instructions, for seemingly good reason. What is your opinion on the
issue?

commit 1c54d77078056cde0f195b1a982cb681850efc08
Author: Jeremy Fitzhardinge <jeremy@goop.org>

The patch _replaces_ btrl instructions with btr
instructions, for seemingly good reason. What is your opinion on the
issue?

Mine is it would be a sympathetic reason if correct, but not good if
the instructions shouldn't exist in the first place. However:

(From the commit message):

    The inline assembly for the bit operations has been changed to remove
    explicit sizing hints on the instructions, so the assembler will pick
    the appropriate instruction forms depending on the architecture and
    the context.

It doesn't do this, as far as I can tell. I cannot make the unsuffixed
versions do anything except an 'l' access in either gas or gcc,
regardless of architecture, pointer size or anything else.

What the code actually does is produce "btsl $imm, addr" if 0 <= imm
<=31 (at compile-time) and "btsl %eax, addr" otherwise, which works
even for 64-bit types (by Stephen's analysis), but involves an extra
"movl $imm, %eax".

From the commit message, it seems like they *do* intend it to produce

"btslq $imm, addr" where possible and might well be open to fixing
this as a bug (that just happens to help Linux compile with Clang)
rather than purely for toolchain compatibility.

That said, it's not clear how to enable both forms to be generated
easily from inline asm, but I'm not an expert.

Tim.