[CI] run pytest in parallel #18146
|
Hey @szha, thanks for submitting the PR.
CI supported jobs: [website, clang, centos-gpu, unix-cpu, windows-cpu, unix-gpu, sanity, windows-gpu, centos-cpu, edge, miscellaneous] Note: |
leezu
left a comment
Thanks.
I find that the runtime of some tests has increased, possibly due to thrashing, overhead of the ThreadedEngine, or other problems. This can be improved in a separate PR, though.
For example
505.28s call tests/python/unittest/test_optimizer.py::test_sparse_adam4
and
460.96s call tests/python/unittest/test_optimizer.py::test_sparse_adam
on the last two unix-cpu runs in this PR (Python3 MKL-CPU), but on master
150.76s call tests/python/unittest/test_optimizer.py::test_sparse_adam
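The comparison above can be reproduced mechanically. The following is a minimal sketch, assuming the lines come from pytest's `--durations` report in the `<seconds>s <phase> <test id>` layout quoted above; the helper names `parse_durations` and `slowdowns` are hypothetical and not part of this PR.

```python
import re

# Matches lines in the layout quoted above: "<seconds>s <phase> <test id>".
DURATION_LINE = re.compile(r"^\s*(\d+(?:\.\d+)?)s\s+(setup|call|teardown)\s+(\S+)\s*$")


def parse_durations(report):
    """Return {test_id: seconds} for 'call' lines, keeping the slowest if repeated."""
    durations = {}
    for line in report.splitlines():
        match = DURATION_LINE.match(line)
        if match is None or match.group(2) != "call":
            continue
        seconds, test_id = float(match.group(1)), match.group(3)
        durations[test_id] = max(seconds, durations.get(test_id, 0.0))
    return durations


def slowdowns(baseline, candidate):
    """Return {test_id: candidate / baseline} for tests present in both runs."""
    return {
        test_id: candidate[test_id] / baseline[test_id]
        for test_id in candidate
        if test_id in baseline and baseline[test_id] > 0
    }


# Timings copied from the comment above.
master = parse_durations(
    "150.76s call tests/python/unittest/test_optimizer.py::test_sparse_adam"
)
parallel = parse_durations(
    "505.28s call tests/python/unittest/test_optimizer.py::test_sparse_adam4\n"
    "460.96s call tests/python/unittest/test_optimizer.py::test_sparse_adam"
)

for test_id, ratio in slowdowns(master, parallel).items():
    print(f"{test_id}: {ratio:.2f}x slower")
```

Only tests that appear in both reports are compared, so a test with no matching entry on master is skipped rather than guessed at.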
|
Will look into it in a follow-up PR.
|
@PatricZhao I noticed that the MKL/MKLDNN tests are taking a lot longer than non-MKL builds in the parallel test setting. I will run them a couple more times to verify, so this is just an FYI. Example:
|
Update: the MKLDNN builds are actually executing a different set of tests, which could explain the time difference. However, the MKL build is indeed executing the same unit tests as the regular Python 3 CPU build, and it is consistently taking a lot longer.
|
Could that be due to a cold boot? Doesn't MKL generate kernels the first time you invoke them?
Sheng Zha <notifications@github.com> wrote on Tue, May 5, 2020, 21:13:
… Update: the MKLDNN builds are actually executing a different set of tests
which could explain the time difference. However, the MKL build is indeed
executing the same unittest as regular python 3 CPU build and it's
consistently taking a lot longer.
|
|
@marcoabreu indeed, that could be a likely cause. I reported my findings in #18244 and we can continue the discussion there.
* run pytest in parallel
* disable memory pool
* address flaky ftrl/fm test and layernorm timeout
* mark tests as serial
* use parametrize in numpy op tests
* fix io bugs
* fix gluon rnn cell test and doc
* replace xfail with raises scope
* fix flaky numpy, mkldnn quantize, and rnn tests
* fix tempfile/dir usage
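The "fix tempfile/dir usage" item matters once tests run concurrently, because tests that write to a shared, fixed path can overwrite each other's files. The thread does not show the actual fix, so the following is only a sketch of one common pattern, assuming each test gets a private directory from the standard-library `tempfile.TemporaryDirectory`; the function `save_and_reload` and the file name `params.txt` are hypothetical.

```python
import os
import tempfile


def save_and_reload(payload, filename="params.txt"):
    """Write payload into a private temporary directory and read it back.

    TemporaryDirectory creates a uniquely named directory, so two workers
    running this at the same time never touch the same path, and the
    directory is removed when the with-block exits.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, filename)
        with open(path, "w") as handle:
            handle.write(payload)
        with open(path) as handle:
            return handle.read(), tmpdir


content, used_dir = save_and_reload("weights")
# The data round-trips, and the directory no longer exists afterwards.
print(content == "weights", os.path.exists(used_dir))
```

In a pytest suite, the built-in `tmp_path` fixture gives the same per-test isolation without the explicit context manager.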
Description
run pytest in parallel