Skip to content

[mlir][dxsa] Add bfi instruction#201

Open
hvdijk wants to merge 9 commits into
dxsa-mlirfrom
users/hvdijk/dxsa-mlir-bfi
Open

[mlir][dxsa] Add bfi instruction#201
hvdijk wants to merge 9 commits into
dxsa-mlirfrom
users/hvdijk/dxsa-mlir-bfi

Conversation

@hvdijk

@hvdijk hvdijk commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

This commit generalises the DXSA_*Op to allow arbitrary numbers of arguments.

This commit generalises the DXSA_*Op to allow arbitrary numbers of arguments.
@hvdijk hvdijk requested a review from tagolog June 25, 2026 00:51
@hvdijk

hvdijk commented Jun 25, 2026

Copy link
Copy Markdown
Contributor Author

I'm not sure if TableGen gives us a better way to avoid having to spell out a separate class for every possible number of arguments. Does this approach look okay or should we do it differently?

@tagolog tagolog requested a review from asl June 25, 2026 04:01
DXSA_DstOperandAttr:$dst,
DXSA_SrcOperandAttr:$src,
OptionalAttr<DXSA_ComponentMaskAttr>:$precise);
class DXSA_PlainOp<string mnemonic, int dsts, int srcs> : DXSA_Op<mnemonic> {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asl could you take a look at the way the op is built here?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hvdijk, Thank you for the prototype. I would rather not land the generalized version of the DXSA_PlainOp. There are several reasons:

  • Multi destination operations need semantic operand names and appropriate getter names:
    • sincos -> $sin, $cos, $operand
    • imul / umul -> $hi, $lo, $lhs, $rhs
    • udiv -> $quot, $rem, $lhs, $rhs
    • uaddc -> $sum, $carry, $lhs, $rhs
    • usubb -> $diff, $borrow, $lhs, $rhs
    • swapc -> $dst0, $dst1, $cond, $src0, $src1
      A generic DXSA_PlainOp gives getDst0()/getDst1()/getLhs()/getRhs() for all of them and loses the role. The only ways out are (a) more parametrisation by an explicit name list, or (b) keeping per-shape local classes anyway. We already do (b) for DXSA_SincosOp and DXSA_AtomicOp; it is 5-8 lines per shape.
  • The chain of !foldl/!setdagarg/!interleave/!range is effectively unreadable once written: any mistake shows up as a generated-C++ error with no trail back to the TableGen expression that caused it.
  • No MLIR dialect uses this idiom. SPIR-V, Arith, Math, MemRef, Vector, Affine, LLVM dialect, P4HIR, IREE VM all uses hand-written tablegen classes.
  • About 225 operations (arithmetic, FP, bitwise, condition, conv, atomic) - Unary, Binary, Ternary gain nothing from generalisation. 13 operations with no operands can be covered with 5 lines of clear code - new DXSA_NoOperandOp operation (see [mlir][dxsa] Add control flow instructions #194). Multi destination arithmetic needing its own local class for semantic names - about 7 operations.

@hvdijk hvdijk Jun 29, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi destination operations need semantic operand names and appropriate getter names:

As you mentioned, this can be done by more parametrisation, and that simplifies the TableGen logic further. I've done that now.

The chain of !foldl/!setdagarg/!interleave/!range is effectively unreadable once written: any mistake shows up as a generated-C++ error with no trail back to the TableGen expression that caused it.

This is not true. Errors show up in TableGen, and the usual TableGen techniques apply (running TableGen without any -gen options to display the records). Where you do get confusing C++ errors is when you use this to define ops with the wrong parameters, that is handled terribly in the rest of the code, but that's not a result of this PR, that's already an issue today.

No MLIR dialect uses this idiom.

Do any other MLIR dialects have the same situation to deal with, where there are global operands that apply to all instructions, combined with instruction-specific operands that only apply to the specific instruction? If so, how do they define the global operands in a single place?

About 225 operations (arithmetic, FP, bitwise, condition, conv, atomic) - Unary, Binary, Ternary gain nothing from generalisation.

The unary, binary and ternary instructions gain very little (I wouldn't say nothing, the definition is shorter, but the benefit is minimal), but this fixes the buggy definition of the atomic instructions with no extra work.

13 operations with no operands can be covered with 5 lines of clear code - new DXSA_NoOperandOp operation

I already explained why I think this is a bad idea ("no operand op" isn't general enough to be useful, it doesn't handle the case of instructions that have source operands but no destination operands)

Sorry, this review gives me no path to continue, so I'm going to continue with this path whether it gets merged or not (edit: unless I find a better way of doing this, but I haven't yet). If that ends up a waste of time, so be it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to take Andrew's suggestion from our call, this happens to also avoid the most complicated of the TableGen expressions.

@hvdijk

hvdijk commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

Updated to cover the now-merged PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants