Promote halide_malloc_alignment() to be user-overridable #7204
steven-johnson wants to merge 1 commit into main
Currently, halide_malloc_alignment() is WEAK_INLINE, so even if you attempt to override it, it won't have any effect (the call sites in halide_malloc() will have inlined the implementation selected by LLVM_Runtime_Linker).
This promotes halide_malloc_alignment() to a documented piece of the runtime you can override, with the caveat that it must return the same value every time it is called. To reduce overhead in halide_malloc(), we copy the value into a global at startup time and use that instead.
By itself, this isn't that useful, but it's essential to have in order for #7189 to be robust; I'm pulling this change into a separate PR so that it can land separately, letting downstream users update their halide_malloc() implementations ahead of the "real" change.
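For illustration only, the mechanism described above (an overridable alignment query whose result is copied into a global once at startup and then used by the allocator) might look roughly like the sketch below. Every `sketch_*` name is hypothetical and is not part of the Halide runtime; the default of 32 and the over-allocate-and-stash layout are assumptions, and `__attribute__((weak))` is a GCC/Clang extension.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical weak default (GCC/Clang extension). A user could supply a
// strong definition of the same symbol elsewhere to override it. It is
// assumed to return the same value on every call.
extern "C" __attribute__((weak)) int sketch_malloc_alignment() {
    return 32;  // assumed default, for illustration only
}

// Queried once during static initialization, so the allocator below does
// not pay for a function call on every allocation.
static const size_t sketch_cached_alignment =
    static_cast<size_t>(sketch_malloc_alignment());

// Over-allocate, round up to the cached alignment, and stash the pointer
// returned by malloc just below the aligned block so it can be freed later.
extern "C" void *sketch_malloc(size_t size) {
    const size_t alignment = sketch_cached_alignment;
    // The scheme needs a power-of-two alignment with room for one pointer.
    assert(alignment >= sizeof(void *) && (alignment & (alignment - 1)) == 0);
    void *orig = std::malloc(size + alignment);
    if (orig == nullptr) {
        return nullptr;
    }
    uintptr_t aligned = (reinterpret_cast<uintptr_t>(orig) + alignment) &
                        ~static_cast<uintptr_t>(alignment - 1);
    reinterpret_cast<void **>(aligned)[-1] = orig;
    return reinterpret_cast<void *>(aligned);
}

// Recover the stashed pointer and release the underlying allocation.
extern "C" void sketch_free(void *ptr) {
    if (ptr != nullptr) {
        std::free(reinterpret_cast<void **>(ptr)[-1]);
    }
}
```

Because the value is read only once, an override that returned different values over time would leave the allocator and its callers disagreeing, which is why the description requires a constant result.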
Overhead looks minimal, but why is this essential for the other PR to be robust?
There exists code (e.g. apps/hannk) that calls Now imagine that you run such a piece of code that uses the default ARM alignment of 32, but overrides halide_malloc() in a way that uses (say) align=8. Currently, you have no way to modify what halide_malloc_alignment() returns -- the decisions are hardcoded in LLVM_Runtime_Linker -- and because it's WEAK_INLINE, you can't change the value. I agree that having the constants be "compile-time" constants as we have now is preferable, but if we agree that it's desirable to allow user code to know the alignment contract that the current halide_malloc() implementation is using, I'm not sure of a better way to accomplish this. I'm definitely open to better suggestions, of course.
What do you think of adding a setter instead of a weak-overridable function? It would work on more platforms.
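To make the setter suggestion concrete, here is a rough sketch of what such an interface could look like. The `sketch_*` names, the default of 32, and the power-of-two check are all assumptions made for illustration; none of this is an existing Halide API.

```cpp
#include <atomic>
#include <cstddef>

namespace {
// Process-wide alignment value; 32 is an assumed default for illustration.
std::atomic<size_t> sketch_alignment{32};
}  // namespace

// Hypothetical getter that an allocator (or user code) would consult.
extern "C" size_t sketch_get_malloc_alignment() {
    return sketch_alignment.load();
}

// Hypothetical setter. Returns the previous alignment on success, or 0 if
// the requested value is not a nonzero power of two (the stored value is
// left unchanged in that case).
extern "C" size_t sketch_set_malloc_alignment(size_t alignment) {
    if (alignment == 0 || (alignment & (alignment - 1)) != 0) {
        return 0;
    }
    return sketch_alignment.exchange(alignment);
}
```

A setter avoids any reliance on weak linkage, but nothing in this sketch stops the value from changing while allocations made under the old alignment are still live, so the "same value every time" contract would still rest on the caller.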
Update: apparently initializing a global to the result of a call doesn't work in JIT mode for the qurt runtime (i.e.:
Hmm... that seems like it would work. Let me take a stab at it. (This whole alignment effort is turning into a bigger can of worms than I expected.)
So this definitely can be made to work, but it feels pretty fragile. The problem here is that there are several ways to override the Now, to be fair... we already have a setup in which there is no enforcement that
(Side note: why the heck do we have both
Finally, there is another option I overlooked: punt on the whole problem entirely. At present, there is exactly one use case that I know of for using This is likely the best short-term option. I'll prep a (separate) PR to try that out.
I'm going to close this in favor of an alternate solution.