Performance degradation on ARMv7 (cortex-a9)

Hi Bradley,

I was doing some performance analysis for ARMv7 (cortex-a9) and I noticed that one of my benchmarks degraded by 93%. I have tracked the regression down to the following commit by you:

commit 7c1b77248baaeafec5d6433c3d1da9a2e2b69595
Author: Bradley Smith
[ARM] Introduce subtarget features per ARM architecture.
This allows for accurate architecture targeting as well as removing
duplicate information (hardcoded feature strings) from MCTargetDesc.

I see that in lib/Target/ARM/ARM.td all the features have been removed from the Proc definitions (e.g. ProcA9) and added to the ProcessorModel definitions (e.g. ProcessorModel<"cortex-a9", ...>).
However, the features from the Proc definitions are still read and set in MCSubtargetInfo through the ARMFeatureKV table, so if a Proc definition is empty the corresponding features are not set.
In my case, if I add FeatureFP16 back to the ProcA9 definition in ARM.td I get back all the lost performance.

Could you please give me some insight into how, after your change, the Proc features get set correctly in MCSubtargetInfo and in the other places that access Proc?

Thanks,
Mandeep

Hi,

The idea behind that change was to make ARM.td clearer: architecture features are added to new per-architecture subtarget features, and the CPUs inherit from these. From what I could tell, ProcA9 (and similar) were only being used for their enum value in making codegen decisions, so I moved all of the features they inherit over to the actual CPUs for clarity. The idea is that all features a given target uses come from a combination of the architecture it inherits from and the target itself, not from any intermediary features like ProcA9.

I’m not aware of any place where ProcA9 is used to get subtarget features like this, and after a quick look I still can’t find anything. Where exactly are you seeing ProcA9 being used to get features? Even so, the cortex-a9 processor model itself inherits FeatureFP16 now, so I would expect it to use FP16, unless you’re not using cortex-a9 directly? (In that case, all CPUs that used to inherit ProcA9 now need to inherit all of the features ProcA9 used to inherit, as well as ProcA9 itself, which is what I did in the change you mention.)
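To make the inheritance described above concrete, here is a rough C++ sketch of how a CPU's feature set could be resolved by following "implies" links transitively. It is only a simplified model under stated assumptions, not LLVM's actual code: FeatureMap, expand, resolve, modelBeforeChange and modelAfterChange are hypothetical names, "ArchV7A" is a placeholder for whatever architecture feature the CPU inherits, and the feature lists contain only the entries mentioned in this thread.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each name maps to the names it directly implies.
using FeatureMap = std::map<std::string, std::vector<std::string>>;

// Add `name` and everything it transitively implies to `out`.
void expand(const FeatureMap& implies, const std::string& name,
            std::set<std::string>& out) {
    if (!out.insert(name).second) return;  // already visited
    auto it = implies.find(name);
    if (it == implies.end()) return;       // leaf feature, implies nothing
    for (const std::string& f : it->second) expand(implies, f, out);
}

// Features enabled by selecting `name` (the name itself is excluded).
std::set<std::string> resolve(const FeatureMap& implies,
                              const std::string& name) {
    assert(!name.empty());
    std::set<std::string> out;
    expand(implies, name, out);
    out.erase(name);
    return out;
}

// Before the change (as described in the thread): ProcA9 carries FeatureFP16.
FeatureMap modelBeforeChange() {
    return {{"ProcA9", {"FeatureFP16"}},
            {"cortex-a9", {"ProcA9"}}};
}

// After the change: ProcA9 is an empty marker; the CPU lists the features.
// "ArchV7A" is a placeholder name, not taken from ARM.td.
FeatureMap modelAfterChange() {
    return {{"ProcA9", {}},
            {"cortex-a9", {"ProcA9", "ArchV7A", "FeatureFP16"}}};
}
```

In this model, resolving "cortex-a9" yields FeatureFP16 both before and after the change; only resolving "ProcA9" on its own gives a different answer.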

Regards,

Bradley Smith

Thanks Bradley.

I see that the features set in ARM.td get written to the generated file /llvm/lib/Target/ARM/ARMGenSubtargetInfo.inc. There, the ProcA9 features appear in the ARMFeatureKV table:

{ "a9", "Cortex-A9 ARM processors", { ARM::ProcA9 }, { ARM::FeatureFP16 } },

With your change, the features for ProcA9 in the above entry are empty. This ARMFeatureKV table is then read in MC/MCSubtargetInfo.cpp in the getFeatures() function.

Thanks,
Mandeep

I’m still not sure I follow why you think this is reading ProcA9. The getFeatures function looks up its ‘CPU’ parameter in ARMFeatureKV; in the case of Cortex-A9 that is the entry “cortex-a9”, which maps to the ProcessorModel and contains FeatureFP16, not “a9”, which maps to ProcA9. Where is getFeatures being called with CPU = “a9”?
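The point about which key gets looked up can be sketched in C++ as follows. This is a simplified illustration, not the actual MCSubtargetInfo code: Row, lookupRow, enablesFP16 and rowsAfterChange are hypothetical names, and the rows are reduced to a key plus the features that key implies.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, reduced table row: a lookup key and the features it implies.
struct Row {
    std::string key;
    std::vector<std::string> implies;
};

// Return the row whose key equals `name`, or nullptr if there is none.
const Row* lookupRow(const std::vector<Row>& table, const std::string& name) {
    assert(!name.empty());
    for (const Row& r : table)
        if (r.key == name) return &r;
    return nullptr;
}

// True if the row selected by `name` directly lists FeatureFP16.
bool enablesFP16(const std::vector<Row>& table, const std::string& name) {
    const Row* r = lookupRow(table, name);
    if (r == nullptr) return false;
    for (const std::string& f : r->implies)
        if (f == "FeatureFP16") return true;
    return false;
}

// Rows as described in the thread after the change: "a9" implies nothing,
// while "cortex-a9" lists its features itself.
std::vector<Row> rowsAfterChange() {
    return {{"a9", {}},
            {"cortex-a9", {"ProcA9", "FeatureFP16"}}};
}
```

With these rows, a lookup keyed on "cortex-a9" finds FeatureFP16, while a lookup keyed on "a9" does not, so the result depends entirely on which name is passed in.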

Regards,

Bradley Smith