[RFC] Intrinsic naming convention (words with dots)

Finkel_Hal_J · December 1, 2015, 10:26am

Hi everyone,

We seem to have allowed our documented target-independent intrinsics to acquire a somewhat-haphazard naming system, and I think we should standardize on one convention. All of the intrinsics have 'llvm.' as a prefix, and some have an additional prefix ('llvm.dbg.', 'llvm.eh.', 'llvm.experimental.', etc.), but after that we lose consistency. When there is just a single word (or acronym) everything is fine, but the way we join multiple words (or acronyms) falls into three categories:

1. No separator (e.g. @llvm.readcyclecounter)
2. Using '.' as a separator (e.g. @llvm.sadd.with.overflow)
3. Using '_' as a separator (e.g. @llvm.read_register)

I propose that we standardize on (2) -- words with dots -- as it seems to have a plurality of more-recent intrinsics (and I think it is easy to read, as is (3)). Thoughts?
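
For anyone who wants to check a list of names against the three categories, here is a minimal sketch in Python. It is not part of LLVM: the `separator_style` and `strip_prefixes` helpers and the `KNOWN_PREFIXES` tuple are hypothetical, and the prefix list only covers the namespaces named above, so it is an assumption rather than a complete inventory.

```python
# Hypothetical helper, not an LLVM API. The prefix list is an assumption:
# it holds only the namespace prefixes mentioned in this post.
KNOWN_PREFIXES = ("llvm.", "dbg.", "eh.", "experimental.")


def strip_prefixes(name):
    """Drop a leading '@' and any known namespace prefixes."""
    body = name.lstrip("@")
    stripped = True
    while stripped:
        stripped = False
        for prefix in KNOWN_PREFIXES:
            if body.startswith(prefix):
                body = body[len(prefix):]
                stripped = True
    return body


def separator_style(name):
    """Classify how the words after the namespace prefix are joined.

    A name with neither '.' nor '_' is reported as "none", whether it is
    a single word or several words run together; telling those apart
    needs a human (or a dictionary).
    """
    body = strip_prefixes(name)
    has_dot = "." in body
    has_underscore = "_" in body
    if has_dot and has_underscore:
        return "mixed"
    if has_dot:
        return "dots"
    if has_underscore:
        return "underscores"
    return "none"
```

With these assumptions, `separator_style("@llvm.sadd.with.overflow")` gives `"dots"`, `separator_style("@llvm.read_register")` gives `"underscores"`, and `separator_style("@llvm.eh.begincatch")` gives `"none"`.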

Although this is somewhat subjective, here's our current set of intrinsics with multiple words (or acronyms) by these categories. I'm excluding here externally-defined terms (e.g. llvm.va_start):

No separators (except for the initial namespace prefix):

@llvm.gcroot
@llvm.gcread
@llvm.gcwrite

@llvm.experimental.stackmap
@llvm.experimental.patchpoint

@llvm.experimental.gc.statepoint

@llvm.returnaddress
@llvm.frameaddress

@llvm.localescape
@llvm.localrecover

@llvm.stacksave
@llvm.stackrestore

@llvm.pcmarker
@llvm.readcyclecounter

@llvm.bitreverse

@llvm.eh.begincatch
@llvm.eh.endcatch

@llvm.eh.padparam

@llvm.stackprotector
@llvm.stackprotectorcheck
@llvm.objectsize

@llvm.donothing

Words with dots:

@llvm.sadd.with.overflow
@llvm.uadd.with.overflow
@llvm.ssub.with.overflow
@llvm.usub.with.overflow
@llvm.smul.with.overflow
@llvm.umul.with.overflow

@llvm.convert.to.fp16
@llvm.convert.from.fp16

@llvm.eh.typeid.for

@llvm.init.trampoline
@llvm.adjust.trampoline

@llvm.masked.load
@llvm.masked.store

@llvm.masked.gather
@llvm.masked.scatter

@llvm.lifetime.start
@llvm.lifetime.end

@llvm.invariant.start
@llvm.invariant.end
@llvm.invariant.group.barrier

@llvm.var.annotation
@llvm.ptr.annotation

@llvm.bitset.test

Words with underscores (except for the initial namespace prefix):

@llvm.read_register
@llvm.write_register

@llvm.clear_cache

@llvm.instrprof_increment
@llvm.instrprof_value_profile

Thanks again,
Hal

echristo · December 1, 2015, 5:03pm

SGTM.

Thanks!

-eric

Krzysztof_Parzyszek · December 1, 2015, 5:15pm

How about using dots to separate "contexts" and underscores to separate words, e.g.

llvm.gc.* -- stuff related to GC
llvm.gc.read
llvm.gc.do_something_else
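
One way to read this scheme, as a rough Python sketch: everything before the last dot is a context, and the last component is the operation name, split into words on underscores. The `parse_intrinsic` function is hypothetical, and treating the final component as the operation name is an assumption of this sketch.

```python
def parse_intrinsic(name):
    """Split a name into (contexts, words) under the proposed scheme.

    Assumption: the last dot-separated component is the operation name
    and every earlier component is a context (namespace).
    """
    parts = name.lstrip("@").split(".")
    *contexts, operation = parts
    return contexts, operation.split("_")
```

Here `parse_intrinsic("llvm.gc.do_something_else")` yields `(["llvm", "gc"], ["do", "something", "else"])`, and `parse_intrinsic("llvm.gc.read")` yields `(["llvm", "gc"], ["read"])`.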

-Krzysztof

Finkel_Hal_J · December 1, 2015, 5:23pm

Eric Christopher wrote: "SGTM. Thanks!"

Follow-up question: Once we decide on a convention, should we:

1. Just document it, leave existing things as-is, but make all new intrinsics comply with the convention.

2. Update all existing intrinsic names to follow the naming convention (with auto-upgrade for bitcode as necessary).

3. If we do (2), does that constitute an ABI break at the C level unless special provisions are made?
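
To make option (2) concrete, here is a rough Python model of what a name-level auto-upgrade could look like. It is only a sketch, not the actual bitcode upgrade code: the `RENAMES` table and `upgrade_name` are hypothetical, and the new spellings assume the "words with dots" convention is the one adopted.

```python
# Hypothetical old -> new spellings, assuming "words with dots" is adopted.
RENAMES = {
    "llvm.read_register": "llvm.read.register",
    "llvm.write_register": "llvm.write.register",
    "llvm.clear_cache": "llvm.clear.cache",
}


def upgrade_name(name):
    """Rewrite a legacy intrinsic name, keeping any trailing suffix.

    A legacy name matches either exactly or as a prefix followed by '.',
    so a type suffix on an overloaded intrinsic is carried over as-is.
    """
    for old, new in RENAMES.items():
        if name == old or name.startswith(old + "."):
            return new + name[len(old):]
    return name
```

Under these assumptions, `upgrade_name("llvm.read_register.i64")` returns `"llvm.read.register.i64"`, while a name that already complies, such as `"llvm.sadd.with.overflow.i32"`, comes back unchanged.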

Thanks again,
Hal

bogner · December 1, 2015, 5:33pm

Krzysztof Parzyszek via llvm-dev <llvm-dev@lists.llvm.org> writes:

mjacob · December 1, 2015, 5:42pm

That's what I thought also when reading the proposal. I always thought of dots in intrinsic names as namespace separators, which doesn't always conform with actual usage.

-Manuel

David_Chisnall3 · December 1, 2015, 6:34pm

My concern with this proposal is that the process that generates the C++ enum values transforms dots into underscores. Mixing dots and underscores in the IR seems really bad because there are then multiple possible IR values for any given C++ value. I’d much prefer that we remove the existing uses of underscores and make it explicit that a dot in IR means an underscore in C++.
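
The ambiguity can be illustrated with a small Python model of that mapping. This is a sketch only: `enum_name` and `find_collisions` are hypothetical, and dropping the `llvm.` prefix before replacing dots is an assumption about how the enumerator names are formed.

```python
def enum_name(ir_name):
    """Model the IR-name -> C++ enumerator mapping: dots become underscores.

    Assumption: the leading 'llvm.' is dropped before the replacement.
    """
    body = ir_name.lstrip("@")
    if body.startswith("llvm."):
        body = body[len("llvm."):]
    return body.replace(".", "_")


def find_collisions(ir_names):
    """Group IR names by enumerator and keep only the ambiguous groups."""
    by_enum = {}
    for ir_name in ir_names:
        by_enum.setdefault(enum_name(ir_name), []).append(ir_name)
    return {enum: names for enum, names in by_enum.items() if len(names) > 1}
```

In this model, `llvm.read_register` and a hypothetical `llvm.read.register` both map to `read_register`, so `find_collisions` reports them as one ambiguous group; the enumerator alone no longer tells you which IR spelling was meant.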

David

Bruce_Hoult · December 1, 2015, 7:19pm

So then that leaves camelCase for words in a context…

dexonsmith · December 1, 2015, 10:55pm

This idea SGTM, using `.` as a namespace (and otherwise using `_`).

Chris_Lattner · December 2, 2015, 5:16am

I’m fine with “words with dots” or “dots are namespaces and underscores separate parts of words”. If you’re really on board with doing the autoupgrade logic from the old names, then I slightly prefer dots for namespaces.

-Chris

preames · December 2, 2015, 11:44pm

This proposal - dots as namespaces, underscore for words - would be my preferred scheme, but I really don't have much of a strong preference. Any reasonable scheme which is documented and consistent works for me.

Philip

echristo · December 2, 2015, 11:52pm

Yep

And if it autoupgrades then it’s fairly easy to change anyhow.

-eric

John_Criswell4 · December 3, 2015, 1:35am

Dear Hal,

The current rule for an intrinsic, IIRC, is llvm.<str> where <str> is some arbitrary name that is allowed within an LLVM function name.

While I can understand the desire for consistency, I think what you suggest is a purely aesthetic change with no real value. If you want to spend your time on aesthetics, that's fine with me, but you're introducing changes to the LLVM assembler, disassembler, and documentation to do it. It may also cause issues with in-tree and out-of-tree test suites that grep for intrinsic names in LLVM assembly output (not sure how many tests do that, but it's possible).

Personally, I wouldn't spend my time on it, but that's just me.

FWIW,

John Criswell
