I have been experimenting with adding support for Xilinx primitives to CIRCT. Do people here think there is much value in bringing this into CIRCT? I think it would make it tremendously easier to write dialects and passes that use Xilinx primitives directly.
My experimental branch allows CIRCT to compile the following IR:
hw.module @Foo(%a: i1) -> (%o: i1) {
// LUT6 is a xilinx primitive for a single-output LUT
%1 = xilinxPrimitives.LUT6 %a, %a, %a, %a, %a, %a { INIT=0x0 } : i1
// LUT6_2 is a xilinx primitive for a 2-output LUT...
%2, %3 = xilinxPrimitives.LUT6_2 %a, %a, %a, %a, %1, %1 { INIT=0x8000000000000000 } : (i1, i1)
hw.output %1 : i1
}
into Verilog instantiations of FPGA primitives:
% ./bin/circt-opt --lower-xilinx-primitives-to-hw ./a.mlir | ./bin/circt-translate -export-verilog
// external module LUT6_2
// external module LUT6
module Foo(
input a,
output o);
wire prim_1_O5; // <stdin>:6:30
wire prim_1_O6; // <stdin>:6:30
wire prim_0_O; // <stdin>:5:17
LUT6 #(
.INIT(64'd0)
) prim_0 ( // <stdin>:5:17
.I0 (a),
.I1 (a),
.I2 (a),
.I3 (a),
.I4 (a),
.I5 (a),
.O (prim_0_O)
);
LUT6_2 #(
.INIT(64'd9223372036854775808)
) prim_1 ( // <stdin>:6:30
.I0 (a),
.I1 (a),
.I2 (a),
.I3 (a),
.I4 (prim_0_O), // <stdin>:5:17
.I5 (prim_0_O), // <stdin>:5:17
.O5 (prim_1_O5),
.O6 (prim_1_O6)
);
assign o = prim_0_O; // <stdin>:5:17, :7:5
endmodule
Note that the pass implicitly inserts hw.module.extern declarations!
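As a side note on the example above: the INIT attribute is the LUT's truth table, and 0x8000000000000000 is the same value that ExportVerilog prints as 64'd9223372036854775808. The sketch below is only an illustration and not part of the branch. It assumes the usual LUT6 convention that bit k of INIT is the output when the inputs {I5, ..., I0}, with I0 as the least significant bit, encode the number k. The helper name lut6_init is hypothetical.

```python
def lut6_init(fn):
    """Build the 64-bit INIT value of a LUT6 from a Python predicate.

    Assumed convention: bit k of INIT is the LUT output when the inputs,
    read as the 6-bit number {I5, I4, I3, I2, I1, I0}, equal k.
    """
    init = 0
    for k in range(64):
        # inputs[0] is I0 (least significant address bit), inputs[5] is I5.
        inputs = [(k >> bit) & 1 for bit in range(6)]
        if fn(*inputs):
            init |= 1 << k
    return init


# A 6-input AND is 1 only at address 63, so only the top INIT bit is set.
and6 = lut6_init(lambda i0, i1, i2, i3, i4, i5: i0 & i1 & i2 & i3 & i4 & i5)
print(hex(and6))      # 0x8000000000000000
print(f"64'd{and6}")  # 64'd9223372036854775808
```

So the LUT6_2 in the example computes a 6-input AND on its O6 output, and the LUT6 with INIT=0x0 is a constant 0.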
The steps from “specification” to (hopefully) synthesizable Verilog:
- Get the open-source Xilinx Unisim Verilog files (Apache license). The Verilog modules that simulate Xilinx primitives are annotated with the celldefine compiler directive.
- Write a simple Verilog parser that picks out the relevant information (parameters, I/O ports). The parser generates an ODS file for the XilinxPrimitives dialect, i.e. XilinxRawPrimitives.td, and a C++ file that describes all the I/O ports, i.e. XilinxRawPrimitives.cpp.
- Create a pass that lowers xilinxPrimitives::{LUT3, LUT4, …} into simple hw.instance ops (thank you, MLIR, for this dynamism).
- Generate Verilog as usual. In fact, in my experiments, I did not have to modify ExportVerilog.cpp.
Notably, there is no need for anyone to manually transcribe anything from PDF → MLIR. I suspect this approach is not hard to extend to other FPGA vendors.
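To give a feel for the parsing step, here is a rough Python sketch. It is not the parser from my branch. The sample module is hand-written rather than copied from Unisim, the regexes only cover simple ANSI-style declarations, and the XilinxPrimitiveOp base class in the emitted ODS is a hypothetical name. Port widths are dropped and everything is emitted as I1 to keep the example short.

```python
import re

# Hand-written stand-in for a Unisim cell (not copied from the real files).
SAMPLE = """
`celldefine
module LUT2 #(
  parameter [3:0] INIT = 4'h0
)(
  output O,
  input I0,
  input I1
);
endmodule
`endcelldefine
"""

CELL_RE = re.compile(r"`celldefine(.*?)`endcelldefine", re.S)
MODULE_RE = re.compile(r"\bmodule\s+(\w+)")
# Optional type keyword and optional [msb:lsb] range before the parameter name.
PARAM_RE = re.compile(r"\bparameter\s+(?:\w+\s+)?(?:\[[^\]]*\]\s*)?(\w+)\s*=")
PORT_RE = re.compile(
    r"\b(input|output|inout)\s+(?:wire\s+|reg\s+)?(?:\[[^\]]*\]\s*)?(\w+)"
)


def parse_cells(verilog_text):
    """Return name, parameter names and (direction, name) ports per cell."""
    cells = []
    for body in CELL_RE.findall(verilog_text):
        module = MODULE_RE.search(body)
        if module is None:
            continue
        cells.append({
            "name": module.group(1),
            "params": PARAM_RE.findall(body),
            "ports": PORT_RE.findall(body),
        })
    return cells


def emit_ods(cell):
    """Emit a simplified ODS def; inout ports are ignored in this sketch."""
    name = cell["name"]
    params = ", ".join(cell["params"])
    ins = ", ".join(f"I1:${n}" for d, n in cell["ports"] if d == "input")
    outs = ", ".join(f"I1:${n}" for d, n in cell["ports"] if d == "output")
    lines = [
        f"// parameters: {params}",
        # XilinxPrimitiveOp is a hypothetical base class.
        f'def {name}Op : XilinxPrimitiveOp<"{name}"> {{',
        f"  let arguments = (ins {ins});",
        f"  let results = (outs {outs});",
        "}",
    ]
    return "\n".join(lines)


cells = parse_cells(SAMPLE)
print(emit_ods(cells[0]))
```

A real version would also need to handle comments, non-ANSI port lists, and bus widths, which is part of why I am unsure whether a partial parser belongs in the tree.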
There are still some pending issues from my experimentation.
- On my laptop, libCIRCTXilinxPrimitives.a is 110MB. I think this can be halved by sharing the verifier, but I can’t see it going much lower easily.
- Compilation of the generated files is slow.
- I haven’t tested this with xsim yet.
- For reasons I have not investigated, mlir-tablegen incorrectly handles def PCIE_3_0:... and generates a 3_0 class instead of PCIE_3_0.
- The HW dialect requires all instantiations to have names. This doesn’t really mean much for primitives.
- I don’t know if there are performance implications of having 300+ ops. A small number of them (PCIe blocks, GTH…) have >30 I/O ports. I would probably just filter down to a small set of operations to start with…
- I am unsure if adding a partial Verilog parser is a good idea. Maybe it would make sense to clean up the files, put them in a repo somewhere else, and be done with it (they will probably never change).