[RELAY][AUTOTVM] Extract tuning tasks from Relay programs#2181
[RELAY][AUTOTVM] Extract tuning tasks from Relay programs#2181tqchen merged 14 commits intoapache:masterfrom
Conversation
|
If someone has pointers on the current way to do relay inference, we can also add relay versions of the |
|
I am also porting the |
|
|
||
| logger = logging.getLogger('autotvm') | ||
|
|
||
| def serialize_args(args): |
There was a problem hiding this comment.
We can remove this by adding custom json encoder and decoder in https://github.com/dmlc/tvm/blob/master/python/tvm/autotvm/record.py#L54
| # NOTE: To add more ops, you only need to change the following lists | ||
| # relay op -> topi compute | ||
| self.op2topi = { | ||
| relay.op.nn.conv2d: [topi.nn.conv2d, topi.nn.depthwise_conv2d_nchw, |
There was a problem hiding this comment.
This op2topi is the only different part between TaskExtractEnv for nnvm and relay.
I think we can move TaskExtractEnv to topi_integration.py, and let nnvm and relay part pass in this argument.
There was a problem hiding this comment.
The signatures are also a little different because nnvm and relay require different parameters (e.g., relay gets param dict, nnvm requires shape). How should we handle this?
There was a problem hiding this comment.
I am referring to the class TaskExtractEnv, which is almost TOPI level.
There was a problem hiding this comment.
Ok, good call, we can merge these.
There was a problem hiding this comment.
One thing to notice is that we pass params to relay.build, which means we will do constant folding and call tvm functions.
Then the python multiprocessing package will become invalid and we cannot tune with multiprocessing. Do you notice this in your tuning? A possible workaround is to run relay.build in a separate thread.
There was a problem hiding this comment.
The CI breaks now after merging in the test when nnvm is imported. Is this a circular import?
There was a problem hiding this comment.
Is the issue that the CI script does not add NNVM to the pythonpath for relay tests?
https://github.com/dmlc/tvm/blob/master/tests/scripts/task_python_integration.sh
There was a problem hiding this comment.
I think so. Because this is a tvm test. This's why I put the original test to nnvm folder.
There was a problem hiding this comment.
Ok, the reason I put it there was because it was a "relay/autotvm" test which doesn't really have anything to do with nnvm. I can move it to the nnvm directory.
There was a problem hiding this comment.
No, we should not move it to nnvm. See my comment below
| @@ -54,6 +54,7 @@ class TaskExtractEnv: | |||
| def __init__(self): | |||
There was a problem hiding this comment.
We should do better decoupling here, as we will drop nnvm in the future. This can also solve the import problem.
Move TaskExtractEnv to topi_integration.py,
Move all import nnvm to nnvm_integration.py,
Move all import relay to relay_integration.py.
test nnvm_integration.py in nnvm/test
test relay_integration.py in tvm/test
There was a problem hiding this comment.
ok, I will try to refactor for that
There was a problem hiding this comment.
Actually, is there any reason we use relay/nnvm functions to specify which operators to match? Could we just use the topi functions directly or some syntax sugar otherwise?
Currently this seems to introduce some circular import problems because if we do import relay at the top level (as relay does import autotvm). We can hack this by moving the imports to within functions, but I am not sure that is ideal.
There was a problem hiding this comment.
Because one operators conv can correspond to many topi functions.
We have to move import nnvm, import relay inside functions as the old code does.
|
Overall this looks good to me, one question is there a reason we need to fully build the Relay function with the tracing target? isn't the set of operators just a statically inferable property? Can we not extract the type information and call sites from the program and use that as our tuning work load? |
|
I'm not sure about the details here, but is TOPI a blackbox from the perspective of Relay? It seems we would still need to go through TOPI's dispatch system even if we had complete information for Relay's side. But overall this is just a case of "don't fix it if ain't broke" as this implementation is basically borrowed from the NNVM version. |
|
We want to keep the autotvm tuning tasks at TOPI level. Using tracing mode can make the extraction irrelevant to graph ir. So one version of code works for both nnvm and relay. Besides, the dispatching mechanism from relay op to topi is complicated. For example, one relay op can correspond to many topi functions. (e.g., a conv op may call vanila conv, group conv, or depthwise conv in topi). On vta, one op can be run on either CPU or FPGA. |
|
Not sure what the build error is here; do I need to do a rebase or just retrigger CI? |
|
Looks good to me. Maybe I will do a follow up refactor for VTA. |
|
Sorry, can you add some test cases? |
| topi.nn.group_conv2d_nchw], | ||
| nnvm.sym.conv2d_transpose: [topi.nn.conv2d_transpose_nchw], | ||
| nnvm.sym.dense: [topi.nn.dense], | ||
| } |
There was a problem hiding this comment.
can we combine these two SYMBOL2TOPI ?
There was a problem hiding this comment.
Perhaps; @merrymercy should we do this? There have also been discuss board questions from people who didn't know to add conv2d_transpose to the tutorial example when tuning other networks.
There was a problem hiding this comment.
@yzhliu is referring to combine L52 and L123.
This is not related to the problem in the forum.
There was a problem hiding this comment.
The reason we don't combine them currently seems to be because the imports here are tricky to do outside of a function. Should we just delete the extract_from_multiple_graphs function? I do not see it used often.
There was a problem hiding this comment.
I am using it. Maybe we can create another function for these lists
| topi.nn.group_conv2d_nchw], | ||
| tvm.relay.op.nn.conv2d_transpose: [topi.nn.conv2d_transpose_nchw], | ||
| tvm.relay.op.nn.dense: [topi.nn.dense], | ||
| } |
|
@yzhliu please take another look at the pr |
|
Thanks, @yzhliu @eqy @merrymercy this is merged. |
This is a copy-paste of nnvm task extraction for relay for autotvm tuning.
CC @merrymercy @jroesch