Skip to content

Deterministic dace toolchain draft#17

Open
tehrengruber wants to merge 2 commits intobranch_gt4py-next-integration_2026_02_12from
dace_toolchain_deterministic
Open

Deterministic dace toolchain draft#17
tehrengruber wants to merge 2 commits intobranch_gt4py-next-integration_2026_02_12from
dace_toolchain_deterministic

Conversation

@tehrengruber
Copy link
Copy Markdown

Just a basis for further discussions.

Copy link
Copy Markdown

@philip-paul-mueller philip-paul-mueller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a first high level review of the changes.

Comment thread dace/sdfg/graph.py
self._dst = dst
self._data: T = data

def id(self):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are several things here.
First of all this type of edge is only for edges between states or to be precise control flow regions.
Inside a state, there is a second kind of graph, the dataflow graph.
down in the file you will find the MultiEdge (which is probably irrelevant) and the MultiConnectorEdge (which is the relevant), which are used for the dataflow graph.
In addition these other kind of edges are actually missing the id() function.

Then, you have added the id property to the Node, however, this is only the base class for the the nodes of the dataflow graph.
Thus, calling this function will fail, because the nodes, which are control flow regions do not have that attribute.
You must add them either to AbstractControlFlowRegion or what is probably better to ControlFlowBlock, but in that regard I am not fully sure.

Furthermore, two nodes can have multiple connections between them.
This is more relevant for the dataflow graph, however, it is technically still allowed for the state graph, although less relevant.
Thus, edge.id() is not an unique key, as there could be several edges between any two nodes.
To make them unique, you need to include the .data property, which is either a Memlet or an InterstateEdge.

Comment thread dace/sdfg/graph.py
@@ -215,7 +218,7 @@ def __getitem__(self, node: NodeT) -> Iterable[NodeT]:

def all_edges(self, *nodes: NodeT) -> Iterable[Edge[EdgeT]]:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def all_edges(self, *nodes: NodeT) -> Iterable[Edge[EdgeT]]:
def all_edges(self, *nodes: NodeT) -> List[Edge[EdgeT]]:

Comment thread dace/sdfg/nodes.py
return current


# TODO: check if the edges need an id. If source and dest, are equal they should be interchangable right?
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, see my comment above.
Essentially the code:

a[3] = b[4]
a[6] = b[5]

Will lead to an SDFG with two nodes one for a and one for b and two edges between them, each does an assignment.
There are similar situations with Maps.

Comment thread dace/sdfg/nodes.py
"type": typestr,
"label": labelstr,
"attributes": dace.serialize.all_properties_to_json(self),
# TODO(tehrengruber): This id looks very similar to the ID I introduced.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not stable it is essentially the position in the sorted node array, see here.
This design has some very nasty consequences in the way how transformations have to be implemented.
This is the transformation why your transformations look like:

def apply():
    matched_node = self.pattern_node
    # Never use `self.pattern_node` from here always use `matched_node` and pass it to all functions!

Comment thread dace/sdfg/sdfg.py
##############################

# TODO(tehrengruber): This could also be done using the counter.
def _find_new_name(self, name: str):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it would probably better, because not all symbol (names) are stored in self.symbols, for example the symbols used as Map parameters are not inside it.

Comment thread dace/sdfg/state.py
"""
return set()

# TODO(tehrengruber): should we make this an ordered set?
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am actually in favour of that, however, there are some issues.
This function exists because SymPy had it and a lot of other classes have such a method (although it is most likely implemented by used_symbols()).
Thus you will have to change them all.
Furthermore, while here the return type is set[str] this is actually wrong.
Some objects, especially the ones that are very close to SymPy, return a set of dace.symbolic.symbol (which extends (but does not wrap) SymPy's symbol) and sometimes you even get a mixture of them.
So this is a big set of work.

Comment on lines +90 to +91
block_reach: Dict[int, Dict[ControlFlowBlock, OrderedSet[ControlFlowBlock]]]) -> Set[SDFGState]:
closure: Set[SDFGState] = OrderedSet()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
block_reach: Dict[int, Dict[ControlFlowBlock, OrderedSet[ControlFlowBlock]]]) -> Set[SDFGState]:
closure: Set[SDFGState] = OrderedSet()
block_reach: Dict[int, Dict[ControlFlowBlock, OrderedSet[ControlFlowBlock]]]) -> OrderedSet[SDFGState]:
closure: OrderedSet[SDFGState] = OrderedSet()

"""
single_level_reachable: Dict[int, Dict[ControlFlowBlock,
Set[ControlFlowBlock]]] = defaultdict(lambda: defaultdict(set))
OrderedSet[ControlFlowBlock]]] = defaultdict(lambda: defaultdict(set))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
OrderedSet[ControlFlowBlock]]] = defaultdict(lambda: defaultdict(set))
OrderedSet[ControlFlowBlock]]] = defaultdict(lambda: defaultdict(OrderedSet))

# By definition, data that is referenced by the conditions (branching condition,
# loop condition, ...) is not single use data, also remove that.
for cfr in sdfg.all_control_flow_regions():
single_use_data.difference_update(cfr.used_symbols(all_symbols=True, with_contents=False))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used_symbols() also returns a set.

array_name = an.data
write_subsets = set(e.data.dst_subset for e in st.in_edges(an))
write_subsets = OrderedSet(e.data.dst_subset for e in st.in_edges(an))
wss = str(write_subsets)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know outside of the scope of this PR, but serializing a set and then use it as keys does not make sense to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants