Ibv-conduit: Verbs Work Request API#15
Open
PHHargrove wants to merge 9 commits into
Open
Conversation
Use struct assignment for marginally better clarity: the intent is to clone the completion.
This commit adds a `gasnetc_signal_flag` file-scope variable to replace repeated computation (based on the `GASNET_USE_FENCED_PUTS` environment var) when fenced puts are enabled at configure time. When fenced puts are disabled at configure time, the identifier `gasnetc_signal_flag` is instead `#defined` to 0. Additionally, `gasnetc_op_needs_fence_mask` is also now `#defined` to 0 when fenced puts are disabled at configure time.
This commit moves checks on valid `opcode` and a callback invariant from `gasnetc_snd_post_common()` to `gasnetc_snd_validate()`, as well as removing a redundant check for non-NULL `sreq->cep`. Since `gasnetc_snd_validate()` is called by `gasnetc_snd_post_common()`, there is no loss of error checking. This is preparation for multiple callers of `gasnetc_snd_validate()`, all of which should include the checks which this commit moves.
This commit replaces `gasnetc_snd_post_common()` (and the two macros `gasnetc_snd_post()` and `gasnetc_snd_post_inline()` which call it with defaulted arguments) with a serious of per-operation functions: + `gasnetc_post_send_imm()` + `gasnetc_post_write()` + `gasnetc_post_read()` + `gasnetc_post_fetch_add()` + `gasnetc_post_cmp_swp()` Note that currently, these all call the (slightly modified, and now file-scoped) `gasnetc_snd_post_common()` function. However, the new op-specific layer leads to improvements in stats/trace support. Additionally, there is now better opportunity for constant propagation if the compiler chooses to specialize `gasnetc_snd_post_common()`. There are a few non-trivial changes to the callers to marshal arguments differently. While this may appear "arbitrary" at this stage, these changes will enable support (soon) for the Verbs Work Request API.
This commit adds a configure probe for the Verbs Work Request API (`ibv_wr_start()`, et al.). An implementation will follow.
This commit adds the necessary logic in `gasnetc_create_qp()`, which was previously just an argument-marshaling macro, to create queue pairs capable of supporting the Verbs Work Request API. Use of that API will follow.
This commit updates ibv-conduit to use the Verbs Work Request API (`ibv_wr_start()`, et al.) if available and not disabled at configure time. This is a prerequisite for planned DC transport support, but is also touted by Mellanox/NVIDIA as having lower latency than the original `ibv_send_post()` API.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR is the next step toward DC (Dynamic Connection) support in IBV conduit.
It also represents a "modernization" of how we use libibverbs.
See https://man.archlinux.org/man/ibv_wr_post.3.en for the man page describing the Verbs Work Request API.
Status
Well tested on the CGPU nodes, including FAST and EVERYTHING modes both with and without the new
--disable-ibv-wr-api.While vendor docs suggest the new API should be marginally faster (and less memory-bus intensive) than the "legacy" mechanism, my testing is inconclusive. Tests on CGPU are consistent with those claims, but the measured improvements are too small relative to the standard deviations to be statistically significant (thus "inconclusive"). Notably I did not see evidence of any performance degradation in the data I have collected so far.
Ready for review.
Concurrent with code review, I will continue performance evaluation and extend testing to JLSE and Wombat.
Commits (reverse chronological order)
ibv: document --disable-ibv-wr-api
ibv: add an ident string for work request API
ibv: use the Verbs Work Request API when available