Memrefs and maps for tiling

Hi @AlexEichenberger, @imaihal, an update has been made to the memref map layout normalization that deals with ReturnOps. Please see here:

Hi, @bondhugula, @abhishek.varma,
Our test case, which includes maps in dialect operations, is now successfully normalized with your and @AlexEichenberger’s patch. Thanks for your help!

We have another requirement: normalizing a memref whose affine_map layout involves a dynamic dimension, as in the code below. (I only changed the dimension in an existing test case.)

func @permute() {
  %c64 = constant 64 : index
  %A = alloc(%c64) : memref<?x256xf32, affine_map<(d0, d1) -> (d1, d0)>>
  affine.for %i = 0 to %c64 {
    affine.for %j = 0 to 256 {
      %1 = affine.load %A[%i, %j] : memref<?x256xf32, affine_map<(d0, d1) -> (d1, d0)>>
      "prevent.dce"(%1) : (f32) -> ()
    }
  }
  dealloc %A : memref<?x256xf32, affine_map<(d0, d1) -> (d1, d0)>>
  return
}

Currently this is not normalized, but we found that you noted it as a TODO in the comments.
Do you plan to support it?

Hi @imaihal, this isn’t really in our immediate TODO list. Will be happy to help review it if someone takes it up.

Hi @bondhugula, do you think that handling a case where the dynamic dimension is trivially mapped would be an easier stepping stone? See d0 mapping to ? below:

 memref<?x256xf32, affine_map<(d0, d1) -> (d0, d1 floordiv 32, d1 mod 32)>>

I don’t think it’ll make a big difference or any difference at all. It could be done in one shot for the general case I think. An extra “symbol” column would be needed in the constraint system for each dynamic dim, and the upper bound obtained subsequently would be an affine function potentially involving symbols (as opposed to just a constant as was the case for a static memref). It can then be used to construct the allocation for the new memref type. The access replacement logic remains unchanged, right?
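To make the idea of symbolic upper bounds concrete, here is a minimal Python sketch. It is not the pass's implementation. It assumes each layout result is either an input dim, `dim floordiv c`, or `dim mod c`, and all names (`SymExtent`, `normalized_extents`, the tuple encoding of the map) are hypothetical. A dynamic dim is bound to a symbol, so the resulting extent is an expression in that symbol (possibly a ceildiv by a constant, which MLIR affine expressions allow) rather than a constant.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple, Union


@dataclass(frozen=True)
class SymExtent:
    """Symbolic extent ceil(s[index] / div); div == 1 is the symbol itself."""
    index: int
    div: int = 1

    def evaluate(self, sizes: Sequence[int]) -> int:
        # Ceiling division written with floor division only.
        return -(-sizes[self.index] // self.div)


Extent = Union[int, SymExtent]


def normalized_extents(shape: Sequence[Optional[int]],
                       layout: Sequence[Tuple]) -> List[Extent]:
    # Bind one symbol to each dynamic input dimension (None stands for '?').
    symbol_of = {}
    for dim, size in enumerate(shape):
        if size is None:
            symbol_of[dim] = len(symbol_of)

    extents: List[Extent] = []
    for expr in layout:
        kind, dim = expr[0], expr[1]
        size = shape[dim]
        if kind == "dim":
            extents.append(SymExtent(symbol_of[dim]) if size is None else size)
        elif kind == "floordiv":
            tile = expr[2]
            extents.append(SymExtent(symbol_of[dim], tile) if size is None
                           else -(-size // tile))
        elif kind == "mod":
            # Conservative assumption: a mod by c needs c slots.
            extents.append(expr[2])
        else:
            raise ValueError(f"unsupported layout expression: {expr!r}")
    return extents


tiled = [("dim", 0), ("floordiv", 1, 32), ("mod", 1, 32)]
permuted = [("dim", 1), ("dim", 0)]
print(normalized_extents([None, 256], tiled))
print(normalized_extents([None, 256], permuted))
```

For `memref<?x256xf32>` with the tiled map, the first extent stays symbolic and the other two are the constants 8 and 32. With the permutation map, the symbolic extent simply moves to the second position.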

@bondhugula, Thank you for always answering. I have another question about normalizing (static) memrefs. Could you give me any comments or suggestions?

I would like to normalize the following example, but I couldn’t. However, when I remove spv.EntryPoint "GLCompute" @empty, the memrefs are normalized. Is it possible to ignore that line, or do you have any other suggestions? @AlexEichenberger suggested that the normalization may not be able to handle code outside of the func.

(I looked for an example similar to ours in llvm-project/mlir/test and created this one from misc-ops-to-llvm.mlir.)

- Example (not normalized)

$ cat misc-ops-to-llvm_entrypoint.mlir
#map0 = affine_map<(d0, d1) -> (d0 floordiv 32, d1 floordiv 64, d0 mod 32, d1 mod 64)>

module {
  func @empty() {
    %0 = alloc() : memref<10x10xf32, #map0>
    return
  }
  spv.EntryPoint "GLCompute" @empty**
}

I saw the following error message from mlir-opt --normalize-memrefs <this code>:

mlir-opt: llvm-project/llvm/include/llvm/Support/Casting.h:269: typename llvm::cast_retty<X, Y*>::ret_type llvm::cast(Y*) [with X = mlir::CallOp; Y = mlir::Operation; typename llvm::cast_retty<X, Y*>::ret_type = mlir::CallOp]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.

$ ../../../llvm-project/build/bin/mlir-opt  -normalize-memrefs  misc-ops-to-llvm_entrypoint.mlir
mlir-opt: /home/imaihal/docker/imaihal-ubuntu/work/llvm-project/llvm/include/llvm/Support/Casting.h:269: typename llvm::cast_retty<X, Y*>::ret_type llvm::cast(Y*) [with X = mlir::CallOp; Y = mlir::Operation; typename llvm::cast_retty<X, Y*>::ret_type = mlir::CallOp]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
PLEASE submit a bug report to and include the crash backtrace.
Stack dump:
0.      Program arguments: ../../../llvm-project/build/bin/mlir-opt -normalize-memrefs misc-ops-to-llvm_entrypoint.mlir 
 #0 0x000002aa2aa044e8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (../../../llvm-project/build/bin/mlir-opt+0x3044e8)
 #1 0x000002aa2aa02366 llvm::sys::RunSignalHandlers() (../../../llvm-project/build/bin/mlir-opt+0x302366)
 #2 0x000002aa2aa024fe SignalHandler(int) (../../../llvm-project/build/bin/mlir-opt+0x3024fe)
 #3 0x000002aa2cbf2efe 
 #4 0x000003ff9ddbdef4 raise (/lib/s390x-linux-gnu/
 #5 0x000003ff9ddbf37a abort (/lib/s390x-linux-gnu/
 #6 0x000003ff9ddb5ee4 (/lib/s390x-linux-gnu/
 #7 0x000003ff9ddb5f64 (/lib/s390x-linux-gnu/
 #8 0x000002aa2b3eb9c6 (anonymous namespace)::NormalizeMemRefs::updateFunctionSignature(mlir::FuncOp, mlir::ModuleOp) (../../../llvm-project/build/bin/mlir-opt+0xceb9c6)
 #9 0x000002aa2b3edcf6 (anonymous namespace)::NormalizeMemRefs::runOnOperation() (../../../llvm-project/build/bin/mlir-opt+0xcedcf6)
#10 0x000002aa2b3714b2 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager) (../../../llvm-project/build/bin/mlir-opt+0xc714b2)
#11 0x000002aa2b37160e mlir::detail::OpToOpPassAdaptor::runPipeline(llvm::iterator_range<llvm::pointee_iterator<std::unique_ptr<mlir::Pass, std::default_delete<mlir::Pass> >*, mlir::Pass> >, mlir::Operation*, mlir::AnalysisManager) (../../../llvm-project/build/bin/mlir-opt+0xc7160e)
#12 0x000002aa2b3795da mlir::PassManager::run(mlir::ModuleOp) (../../../llvm-project/build/bin/mlir-opt+0xc795da)
#13 0x000002aa2b340a2e performActions(llvm::raw_ostream&, bool, bool, llvm::SourceMgr&, mlir::MLIRContext*, mlir::PassPipelineCLParser const&) (.isra.26) (../../../llvm-project/build/bin/mlir-opt+0xc40a2e)
#14 0x000002aa2b340e76 processBuffer(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, bool, bool, bool, bool, mlir::PassPipelineCLParser const&, mlir::DialectRegistry&) (../../../llvm-project/build/bin/mlir-opt+0xc40e76)
#15 0x000002aa2b341044 mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, mlir::PassPipelineCLParser const&, mlir::DialectRegistry&, bool, bool, bool, bool, bool) (../../../llvm-project/build/bin/mlir-opt+0xc41044)
#16 0x000002aa2b341512 mlir::MlirOptMain(int, char**, llvm::StringRef, mlir::DialectRegistry&, bool) (../../../llvm-project/build/bin/mlir-opt+0xc41512)
#17 0x000002aa2a910dfe main (../../../llvm-project/build/bin/mlir-opt+0x210dfe)
#18 0x000003ff9dda3aca __libc_start_main (/lib/s390x-linux-gnu/
#19 0x000002aa2a915454 _start (../../../llvm-project/build/bin/mlir-opt+0x215454)
#20 0x0000000000000000 
Aborted (core dumped)

- Example (Removed spv.EntryPoint ==> Normalized correctly)

$ cat misc-ops-to-llvm_entrypoint.mlir
#map0 = affine_map<(d0, d1) -> (d0 floordiv 32, d1 floordiv 64, d0 mod 32, d1 mod 64)>

module {
  func @empty() {
    %0 = alloc() : memref<10x10xf32, #map0>
    return
  }
//  spv.EntryPoint "GLCompute" @empty
}
$ ../../../llvm-project/build/bin/mlir-opt  -normalize-memrefs  mi

module {
  func @empty() {
    %0 = alloc() : memref<1x1x32x64xf32>
    return
  }
}
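
The normalized shape 1x1x32x64 is consistent with a simple conservative bound: `floordiv c` over n elements needs ceil(n / c) slots, and `mod c` is given the full c slots. The Python sketch below assumes that rule (it is not taken from the pass) and checks by brute force that every remapped index of the 10x10 memref lands inside that shape without collisions. The function names are hypothetical.

```python
from itertools import product


def map0(d0, d1):
    # Mirrors #map0: (d0 floordiv 32, d1 floordiv 64, d0 mod 32, d1 mod 64).
    return (d0 // 32, d1 // 64, d0 % 32, d1 % 64)


def conservative_shape(shape):
    # Assumed rule: floordiv c -> ceil(n / c) slots, mod c -> c slots.
    n0, n1 = shape
    return (-(-n0 // 32), -(-n1 // 64), 32, 64)


def check_layout(shape):
    """Return the normalized shape after verifying bounds and injectivity."""
    new_shape = conservative_shape(shape)
    seen = set()
    for index in product(*(range(n) for n in shape)):
        new_index = map0(*index)
        if not all(0 <= v < n for v, n in zip(new_index, new_shape)):
            raise ValueError(f"{index} maps out of bounds: {new_index}")
        if new_index in seen:
            raise ValueError(f"{index} collides at {new_index}")
        seen.add(new_index)
    return new_shape


print(check_layout((10, 10)))
```

Under this rule the allocation grows from 10 * 10 = 100 elements to 1 * 1 * 32 * 64 = 2048 elements, because the `mod` dimensions are sized by the tile rather than by the indices actually used.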

@imaihal Irrespective of the current support, this behavior is a bug. This should be easily fixable. Btw, what are the trailing *s at the end? Was this just a typo?

Drive-by comment, since I am not really familiar with the overall conversation topic here, but it is strange that you have spv.entry_point in the module. It should exist only in spv.module, so we have a missing verification there. How is that being added?

@bondhugula Thanks for your comment. Sorry, the *s are a typo. (I was just trying to make the line bold.)
Can you fix it? Or should I investigate more?

@MaheshRavishankar Thanks for checking. Sorry, this code might not be a good example. My actual code produced a similar error, but it is not practical to post here because it requires additional code from our own dialect.
My code places a similar entry-point operation outside of the func, within the module.

#map0 = affine_map<(d0, d1) -> (d0 floordiv 32, d1 floordiv 64, d0 mod 32, d1 mod 64)>

module {
  func @empty() {
    %0 = alloc() : memref<10x10xf32, #map0>
    return
  }
  <I wanted to add some line here to reproduce my error>
}

@imaihal thanks for clarifying.

Side note though. If you are using the SPIR-V dialect for the OpenGL case, it would be interesting to know whether you have found any gaps in the SPIR-V dialect for this. There have been some contributions to enable graphics mode in the SPIR-V dialect, but more would be needed there, I think. If there are specific things that you need for your use case, we can try to create tasks for the community to work on.

Please do go ahead to fix it - I won’t be able to get to this in the next few days.

OK. I’ll try.

I found that the error happens at NormalizeMemRefs.cpp#L268.

When I remove the line ( spv.EntryPoint "GLCompute" @empty ) from the example, the loop at NormalizeMemRefs.cpp#L265-L331 is not entered. I am considering whether the loop can also be skipped when that line is present.
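
The assertion in the stack trace indicates that, inside updateFunctionSignature, some operation was cast to CallOp without being one. A plausible reading is that a non-call user of the @empty symbol (here the entry-point op) was visited. The Python sketch below only illustrates the general guard of filtering symbol users down to calls before rewriting them. It is not the contents of the patch, and the `Op` model and `call_sites` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Op:
    """Toy stand-in for an operation that references a function symbol."""
    name: str    # operation name, e.g. "std.call"
    symbol: str  # referenced function symbol, e.g. "empty"


def call_sites(symbol_users: List[Op], func_name: str) -> List[Op]:
    # Keep only real calls to func_name; other symbol users are skipped
    # instead of being treated as calls.
    return [op for op in symbol_users
            if op.name == "std.call" and op.symbol == func_name]


users = [Op("spv.EntryPoint", "empty"), Op("std.call", "empty")]
print(call_sites(users, "empty"))
```

In C++ terms this corresponds to checking the operation kind before casting, rather than asserting that every symbol user is a call.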

I created a patch to solve the error

@bondhugula I started thinking about how we can normalize dynamic memrefs, but I’m not sure how to do it. Could you give me a few more details about your suggestion?
I think dynamic memrefs in alloc ops are handled when lowering to LLVM (-convert-std-to-llvm). Is normalizing dynamic memrefs also possible at the MLIR level (--normalize-memrefs)?

Yes, dynamic memrefs are all properly supported on the path to LLVM.

Normalizing alloc ops and load/store with dynamic memrefs should be straightforward. It’s the dim op that’s tricky as @AlexEichenberger points out upthread. Could you provide a simple example you have in mind for discussion?

@bondhugula This is an artificial example, but I would like to normalize this kind of dynamic memref with an affine_map.

#map0 = affine_map<(d0, d1) -> (d0, d1 floordiv 32, d1 mod 32)>

func @test_norm_dynamic(%arg0 : memref<?x256xf32, #map0>) -> () {
    %0 = alloc() : memref<?x256xf32, #map0>
    "test.op_norm"(%arg0, %0) : (memref<?x256xf32, #map0>, memref<?x256xf32, #map0>) -> ()
    dealloc %0 :  memref<?x256xf32, #map0>
    return
}

Where should I start with this example?

You’ll need an argument for the alloc.

%0 = alloc(%N) : memref<?x256xf32, #map0>

The key step here is to deduce the sizes of the normalized memref. This is a simpler example and it’s going to be %N x 8 x 32. 8 and 32 are already given to you by the existing logic. The %N would map to a symbol in the constraint system. So this is to be replaced with:

%m = alloc(%N) : memref<?x8x32xf32>

Since load/store subscripts are multidimensional, they are unaffected by the symbols bound to the ?s, so all of that replacement works as is. You only need to add size symbols to the constraint system used to compute the ranges; the upper bound for a normalized memref dimension will then in general be a function of those symbols (instead of a constant).
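
The point that access replacement does not depend on the dynamic size can be checked with a small Python sketch. It assumes row-major storage and the tiled map (d0, d1) -> (d0, d1 floordiv 32, d1 mod 32). The helper names are hypothetical. The remapped subscripts never mention N; for this particular map they even land on the same linear offset, which is a property of this tiling and not of normalization in general.

```python
def remap(i, j):
    # Subscript rewrite for (d0, d1) -> (d0, d1 floordiv 32, d1 mod 32).
    # Note that the dynamic size N does not appear here.
    return (i, j // 32, j % 32)


def row_major_offset(index, shape):
    # Linear offset of a multidimensional index in row-major order.
    offset = 0
    for idx, extent in zip(index, shape):
        offset = offset * extent + idx
    return offset


def offsets_match(n):
    # Compare memref<n x 256> accesses with memref<n x 8 x 32> accesses.
    old_shape, new_shape = (n, 256), (n, 8, 32)
    return all(
        row_major_offset((i, j), old_shape)
        == row_major_offset(remap(i, j), new_shape)
        for i in range(n) for j in range(256)
    )


print(all(offsets_match(n) for n in (1, 3, 64)))
```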

@bondhugula Thanks! Sorry for the late response. I have resumed working on normalizing dynamic memrefs.

In my previous example, the unknown dimensions (?s) are not affected by normalization. It was too simple compared with my actual case.
Excuse me again, but I would like to update my example as follows.

#map0 = affine_map<(d0, d1) -> (d0, d1 floordiv 32, d1 mod 32)>

func @test_norm_dynamic(%arg0 : memref<8x?xf32, #map0>) -> () {
    %0 = alloc() : memref<8x?xf32, #map0>
    "test.op_norm"(%arg0, %0) : (memref<8x?xf32, #map0>, memref<8x?xf32, #map0>) -> ()
    dealloc %0 :  memref<8x?xf32, #map0>
    return
}

How can I normalize memref<8x?xf32, #map0>?

I think the answer is what you wrote before: the upper bound would be an affine function potentially involving symbols. If I could see an example of such an affine function, it would be very helpful.

For example, if the size of the memref has to be %N + 1 (where %N is the one corresponding to the symbol), this information is obtained from the constraint system and used to construct the AllocOp.

%S = affine.apply affine_map<(d0) -> (d0 + 1)>(%N)
%M = alloc(%S) : memref<?xf32>

As to the constraint system, you’ll for example have:

d0   s0   const   >=/== 0
-1    1     1     >= 0
# This means d0 <= s0 + 1.
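
As a sketch of how such a row could be turned into a concrete bound once the symbol values are known, the Python function below solves one inequality row for d0. The row encoding `[coeff_d0, coeff_s0, ..., const]` follows the table above. The function name is hypothetical and this is not how the constraint system in MLIR is implemented.

```python
from typing import Sequence


def upper_bound_on_d0(row: Sequence[int], symbols: Sequence[int]) -> int:
    """Largest integer d0 satisfying row . [d0, s0, ..., 1] >= 0."""
    coeff_d0, *sym_coeffs, const = row
    if coeff_d0 >= 0:
        raise ValueError("row does not bound d0 from above")
    if len(sym_coeffs) != len(symbols):
        raise ValueError("expected one value per symbol column")
    rhs = sum(c * s for c, s in zip(sym_coeffs, symbols)) + const
    # coeff_d0 * d0 + rhs >= 0 with coeff_d0 < 0  <=>  d0 <= rhs / (-coeff_d0)
    return rhs // (-coeff_d0)


# Row from the table: -d0 + s0 + 1 >= 0, i.e. d0 <= s0 + 1.
print(upper_bound_on_d0([-1, 1, 1], [7]))
```

With s0 = 7 the row gives d0 <= 8, matching the reading d0 <= s0 + 1 above.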