gen.local_id: query a work-item's local id
gen.work_group_id: query the id of a work-item's work-group
gen.work_group_size: query the size of a work-item's work-group
gen.num_work_groups: query the number of work-groups
gen.barrier: work-group barrier
gen.sub_group_shuffle: sub-group shuffle
I believe these specific ops are already present in the GPU dialect, including a SPIR-V lowering suitable for Intel GPUs. It’s not clear to me why we need to duplicate the op definitions and lowerings in another place. It would be more productive to add an LLVM lowering for the existing GPU dialect ops, one suitable for the LLVM SPIR-V backend and the Khronos SPIR-V translator.
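
For reference, here is a rough sketch of the GPU dialect ops I have in mind as counterparts to the list above. The pairing in the comments is my assumption about how the proposed gen ops would map, not something taken from the proposal, and the module and function names (@kernels, @example) are made up for illustration.

```mlir
gpu.module @kernels {
  gpu.func @example(%val: f32) kernel {
    // Assumed counterpart of gen.local_id
    %tid = gpu.thread_id x
    // Assumed counterpart of gen.work_group_id
    %bid = gpu.block_id x
    // Assumed counterpart of gen.work_group_size
    %bdim = gpu.block_dim x
    // Assumed counterpart of gen.num_work_groups
    %gdim = gpu.grid_dim x
    // Assumed counterpart of gen.barrier
    gpu.barrier
    // Assumed counterpart of gen.sub_group_shuffle
    %offset = arith.constant 1 : i32
    %width = arith.constant 32 : i32
    %res, %valid = gpu.shuffle xor %val, %offset, %width : f32
    gpu.return
  }
}
```

If this correspondence holds, the missing piece would be only the LLVM-side lowering for these ops, not a new set of op definitions.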