llvm-mca for in-order CPUs (was Re: LLVM Weekly - #375, March 8th 2021)

jayfoad · March 9, 2021, 2:03pm

Thanks for doing this! I am very interested in using it for the AMDGPU
target. Have you given any thought to targets with
MicroOpBufferSize=1? I understand that these are also "in order". I
found that I could get some tests running with these changes:
https://reviews.llvm.org/differential/diff/329308/

But I am really shooting in the dark here. I don't have a good
understanding of the difference between MicroOpBufferSize=0 and 1, and
I am not even sure which setting is really best for AMDGPU.

Thanks,
Jay.

Andrew_Savonichev · March 9, 2021, 4:33pm

Hi Jay,

Jay Foad writes:

* The llvm-mca static performance analysis tool now supports in-order CPUs such
as the Arm Cortex-A55. [d791695](rGd791695cb517).

Thanks for doing this! I am very interested in using it for the AMDGPU
target.

So far the feature has only been tested on ARM in-order CPUs, so it would
be great if you could try it on the AMDGPU target!

Have you given any thought to targets with MicroOpBufferSize=1?
I understand that these are also "in order". I found that I could get
some tests running with these changes:
https://reviews.llvm.org/differential/diff/329308/

But I am really shooting in the dark here. I don't have a good
understanding of the difference between MicroOpBufferSize=0 and 1, and
I am not even sure which setting is really best for AMDGPU.

Frankly, I don't know what the difference is between MicroOpBufferSize=0
and 1. We should probably treat them the same for MCA, so your changes
look good.

atrick · March 9, 2021, 5:54pm

We should really have some alias for MicroOpBufferSize=0/1. It’s too cryptic.

InOrder => MicroOpBufferSize=1
VLIW => MicroOpBufferSize=0

It only affects what instructions the scheduler puts in the ready queue. In VLIW-mode, the scheduler only considers instructions that can be scheduled in the current group. In InOrder mode, the scheduler can weigh the potential latency stall against other heuristics. I don’t think it’s relevant for MCA.

-Andy

// "0" means operations that are not ready in this cycle are not considered
// for scheduling (they go in the pending queue). Latency is paramount. This
// may be more efficient if many instructions are pending in a schedule.
//
// "1" means all instructions are considered for scheduling regardless of
// whether they are ready in this cycle. Latency still causes issue stalls,
// but we balance those stalls against other heuristics.
//
// "> 1" means the processor is out-of-order. This is a machine independent
// estimate of highly machine specific characteristics such as the register
// renaming pool and reorder buffer.
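
To make the 0-versus-1 distinction concrete, here is a minimal toy sketch of which operations the scheduler would consider in a given cycle. It is not LLVM code: `Op`, `ReadyCycle`, and `candidates` are hypothetical names made up for illustration, and the sketch only models the two settings described in the comment above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy model, not LLVM's data structures.
struct Op {
  int Id;
  unsigned ReadyCycle; // first cycle in which all operands are available
};

// Returns the ids of the ops a scheduler would consider at CurCycle.
// BufferSize == 0: only ops that are already ready ("VLIW" style); the
//                  rest would sit in the pending queue.
// BufferSize == 1: every unscheduled op ("InOrder" style); picking one that
//                  is not ready yet would cost an issue stall.
std::vector<int> candidates(const std::vector<Op> &Unscheduled,
                            unsigned CurCycle, unsigned BufferSize) {
  assert(BufferSize <= 1 && "this sketch only models the values 0 and 1");
  std::vector<int> Ids;
  for (const Op &O : Unscheduled)
    if (BufferSize == 1 || O.ReadyCycle <= CurCycle)
      Ids.push_back(O.Id);
  return Ids;
}
```

With ops that become ready at cycles 0, 3, and 1, a call at cycle 1 yields the first and third op for `BufferSize == 0` and all three for `BufferSize == 1`. The difference is only in what the compiler's scheduler gets to choose from, which is why it may not matter for MCA.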

jayfoad · March 10, 2021, 4:17pm

Thanks. I found there is already an MCSchedModel::isOutOfOrder which
makes it slightly less cryptic. I've put a patch up at
https://reviews.llvm.org/D98356 to try to support MicroOpBufferSize=1
in llvm-mca as simply as possible.

Jay.
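
For readers following along, a standalone sketch of the idea is below. It assumes, consistent with the "> 1" rule in the comment quoted earlier, that `MCSchedModel::isOutOfOrder` boils down to `MicroOpBufferSize > 1`. `SchedModelSketch`, `IssueModel`, `classify`, and `useInOrderPipeline` are hypothetical names for illustration only; this is not the code from D98356.

```cpp
#include <cassert>

// Hypothetical stand-in for the single MCSchedModel field used here.
struct SchedModelSketch {
  unsigned MicroOpBufferSize;
  // Assumed to mirror MCSchedModel::isOutOfOrder.
  bool isOutOfOrder() const { return MicroOpBufferSize > 1; }
};

// Hypothetical names following the aliases proposed earlier in the thread.
enum class IssueModel { VLIW, InOrder, OutOfOrder };

inline IssueModel classify(const SchedModelSketch &SM) {
  if (SM.isOutOfOrder())
    return IssueModel::OutOfOrder;
  return SM.MicroOpBufferSize == 0 ? IssueModel::VLIW : IssueModel::InOrder;
}

// An analysis tool that does not care about the VLIW/InOrder distinction
// can treat both non-out-of-order settings the same way.
inline bool useInOrderPipeline(const SchedModelSketch &SM) {
  return !SM.isOutOfOrder();
}

// Small self-check that can be called from a test or from main().
inline void checkSketch() {
  assert(useInOrderPipeline(SchedModelSketch{0}));
  assert(useInOrderPipeline(SchedModelSketch{1}));
  assert(!useInOrderPipeline(SchedModelSketch{2}));
}
```

Keying the decision on a single out-of-order predicate means MicroOpBufferSize=0 and MicroOpBufferSize=1 take the same path, which matches the suggestion above to treat them the same for MCA.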
