[Unity][TVMScript] Avoid dangling reference when printing Call attrs #15923
Merged
masahi merged 1 commit into apache:unity on Oct 13, 2023
Prior to this commit, the `tvm::script::printer::AttrPrinter` class
took the attribute path as a `const ObjectPath&`. In both places
where an `AttrPrinter` is constructed, the temporary object
`n_p->Attr("attrs")` is passed for this argument. While binding a
temporary object to a const reference can extend the lifetime of the
temporary, this requires the const reference to be in the same scope
as the temporary. It does not apply here, because a temporary bound
to a reference parameter lives only until the end of the full
expression containing the call (see [this Stack Overflow
post](https://stackoverflow.com/a/2784304)). Therefore, the reference
is valid only through the construction of `AttrPrinter printer`, and
is dangling when the printer is used on the following line.

This dangling reference has caused segfaults in CI for unrelated
changes ([example](https://ci.tlcpack.ai/blue/organizations/jenkins/tvm-unity/detail/PR-15904/3/pipeline)),
and can be reproduced with the following test case.
```python
import pytest
from tvm.script import relax as R


@pytest.mark.parametrize("iter", range(10000))
def test_argmax_without_specified_axis(iter):
    @R.function
    def func(x: R.Tensor((1, 2, 3, 4), "float32")):
        return R.argmax(x)

    func.script(show_meta=True)
```
This test case is not included in this commit, as the reproduction is
not consistent, with failure requiring on the order of 10k iterations
to trigger. In addition, reproduction was sensitive to the following
conditions.
* The function being printed must contain at least one `relax::Call`
  node whose operation has attributes.
* TVM must be built with optimization enabled. With gcc, the
  `-ftree-dse` optimization, which is enabled at `-O1`, is required to
  trigger the bug.
* Python's default allocator must be used. If `PYTHONMALLOC=malloc`
  is set to use the system's `malloc` instead, the segfault is no
  longer triggered.
This commit updates `AttrPrinter` to accept the `ObjectPath` by value.
With the change applied, the above test ran 100k times without error.
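
The lifetime difference between the two signatures can be sketched in isolation. The following is a minimal illustration, not TVM code: `Path`, `MakeAttrPath`, `RefPrinter`, and `ValuePrinter` are hypothetical stand-ins for `ObjectPath`, `n_p->Attr("attrs")`, and the old and new forms of `AttrPrinter`. It assumes the printer stores its path argument as a member. A live-instance counter makes the lifetime observable without ever reading through the dangling reference.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for ObjectPath (not TVM's real class). It counts live
// instances so lifetimes can be observed without reading a dangling reference.
struct Path {
  static int live;
  std::string repr;
  explicit Path(std::string r) : repr(std::move(r)) { ++live; }
  Path(const Path& other) : repr(other.repr) { ++live; }
  Path(Path&& other) noexcept : repr(std::move(other.repr)) { ++live; }
  ~Path() { --live; }
};
int Path::live = 0;

// Plays the role of n_p->Attr("attrs"): every call returns a temporary.
Path MakeAttrPath() { return Path("root.attrs"); }

// Assumed shape of the old pattern: the constructor argument is kept as a
// reference member, so nothing keeps the temporary alive.
struct RefPrinter {
  const Path& path;
  explicit RefPrinter(const Path& p) : path(p) {}
};

// Assumed shape of the fixed pattern: the printer owns its own Path.
struct ValuePrinter {
  Path path;
  explicit ValuePrinter(Path p) : path(std::move(p)) {}
};

int LivePathsAfterRefConstruction() {
  RefPrinter printer(MakeAttrPath());
  // The temporary was destroyed at the end of the declaration above.
  // printer.path now dangles and must not be read.
  (void)printer;
  return Path::live;
}

int LivePathsAfterValueConstruction() {
  ValuePrinter printer(MakeAttrPath());
  // The printer's own copy is still alive here.
  return Path::live;
}

std::string PrintWithValue() {
  ValuePrinter printer(MakeAttrPath());
  assert(Path::live == 1);   // exactly the printer's member remains
  return printer.path.repr;  // safe: reads an owned object
}
```

In this sketch, `LivePathsAfterRefConstruction()` reports that no `Path` survives construction of the reference-holding printer, while `LivePathsAfterValueConstruction()` reports one, the member owned by the printer. Taking the argument by value and moving it into the member costs at most one extra move.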
masahi approved these changes on Oct 13, 2023