Make gpu thread and block for loop names opaque#8133
Merged
Conversation
This is one of our largest remaining type of magic name. These were explicitly constructed in lots of places and then explicitly checked for with ends_with in lots of places. This PR makes the names opaque. Only CanonicalizeGPUVars.cpp knows what they are, and they don't have to be a single fixed thing as long as they're consistent within a process. Also reduced the number of GPU dimensions to three more uniformly. We were already asserting this, but there was lots of dead code in lowering passes after gpu loop validation that allowed for four. Also fixed a bug I found in is_block_uniform. It didn't consider that the dependence on a gpu thread variable in a load index could be because a let variable encountered depends on a gpu thread variable.
Member
Author
|
Tagging @derek-gerstmann for review for this one because I made non-trivial changes to the vulkan backend. |
derek-gerstmann
approved these changes
Mar 4, 2024
| Stmt body = mutate(op->body); | ||
|
|
||
| Expr var = Variable::make(Int(32), "." + thread_names[dim]); | ||
| Expr var = Variable::make(Int(32), gpu_thread_name(dim)); |
Contributor
There was a problem hiding this comment.
Should this be a UInt(32) ?
Member
Author
There was a problem hiding this comment.
I don't think so. Our loop variables are currently always Int(32).
1 task
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This is one of our largest remaining type of magic name. These were explicitly constructed in lots of places and then explicitly checked for with ends_with in lots of places.
This PR makes the names opaque. Only CanonicalizeGPUVars.cpp knows what they are, and they don't have to be a single fixed thing as long as they're consistent within a process.
Also reduced the number of GPU dimensions to three more uniformly. We were already asserting this, but there was lots of dead code in lowering passes after gpu loop validation that allowed for four.
Also fixed a bug I found in is_block_uniform. It didn't consider that the dependence on a gpu thread variable in a load index could be because a let variable encountered depends on a gpu thread variable.