[RFC] PC64le - do not use the patched functions as callbacks directly#789
Conversation
|
The current patch applies on top of @euspectre patch. I have included the patch from @euspectre tree in this pull request for readability and testing purpose. I will redo the pull request, with my patch alone, in the final version. |
kpatch-build/create-diff-object.c
Outdated
| toc_dynrela_off = malloc(sizeof(int) * (nr/2)); | ||
| memset(toc_dynrela_off, 0, sizeof(*toc_dynrela_off)); | ||
| off_idx = 0; | ||
|
|
There was a problem hiding this comment.
We should check for malloc failure here: if (!toc_dynrela_off) ERROR("malloc") ... also isn't that memset going to only clear the first offset value (ie, toc_dynrela_off[0] and not toc_dynrela_off[0..n]) ?
There was a problem hiding this comment.
nitpick: also maybe the malloc should be sizeof(*toc_dynrela_off) * (nr/2) ?
There was a problem hiding this comment.
Thanks for the review. Will include the suggestions in next iteration.
|
How much of the added code in |
It's a prototype for comments. Changes introduced to |
|
I'm confused about why we have to add a new .toc entry. Why can't we just convert those .toc.rela function ptr relas to dynrelas? |
I got little mixed with details and implemented it with duplicating |
464e608 to
d5eca81
Compare
|
With #791, Will wait for the comments on #791 and on this pull-request, before re-spinning v3. |
kpatch-build/create-diff-object.c
Outdated
| continue; | ||
| } | ||
| } | ||
| #else |
There was a problem hiding this comment.
The control flow is way too complex and hard to follow.
What if we add a rela->needs_dynrela bool, which gets set in the first loop which increments the nr variable? Then we could move the may_need_dynrela() call there, and that would hopefully vastly simplify this code (e.g., no need for the toc_dynrela_off array or the spaghetti code flow).
There was a problem hiding this comment.
clever :). Will re-do the patch based on your suggestion.
d5eca81 to
465ec27
Compare
|
v3 Changes:
|
| } | ||
| if (may_need_dynrela(rela)) | ||
| toc_rela(rela)->need_dynrela = 1; | ||
| } |
There was a problem hiding this comment.
This is a lot better than before, but I'm still confused by this code. Why are all the toc special cases needed in function_ptr_rela(), may_need_rela(), and here? This code loops through all the sections, including .rela.toc., so can we just do something like
list_for_each_entry(rela, &sec->relas, list) {
nr++; /* upper bound on number of kpatch relas and symbols */
if (may_need_dynrela(rela))
rela->need_dynrela = 1;
}
along with removing all the toc_rela() calls in function_ptr_rela() and may_need_rela()?
There was a problem hiding this comment.
Relocation section '.rela.toc' at offset 0xcc6b0 contains 46 entries:
...
0000000000000128 0000353400000026 R_PPC64_ADDR64 0000000000000000 genl_sk_destructing_waitq + 0
0000000000000130 0000003e00000026 R_PPC64_ADDR64 0000000000000000 .text.netlink_release + 35c
0000000000000138 0000002a00000026 R_PPC64_ADDR64 0000000000000000 .text.deferred_put_nlk_sk + 8
Relocation section '.rela.text.netlink_release' at offset 0xcadf0 contains 44 entries:
...
0000000000000390 000035580000000a R_PPC64_REL24 0000000000000000 __local_bh_enable_ip + 0
0000000000000398 0000007300000032 R_PPC64_TOC16_HA 0000000000000000 .toc + 138
00000000000003a0 0000007300000040 R_PPC64_TOC16_LO_DS 0000000000000000 .toc + 138
When looping with:
list_for_each_entry(rela, &sec->relas, list) {
nr++; /* upper bound on number of kpatch relas and symbols */
if (may_need_dynrela(rela))
rela->need_dynrela = 1;
}
netlink_release + 0x3a0 is not a function symbol, but of relocation type (R_PPC64_TOC16_HA/R_PPC64_TOC16_LO_DS)
referring to symbol in .rela.toc + 0x138, We need to operate on both rela->type and rela->sym->sec->rela + rela->addend, where netlink_release + 0x3a0 is checked for rela-type and .text.deferred_put_nlk_sk + 8 for STT_FUNC/switch label, but the need_dynrela should be set only for the .rela.toc symbol.
Without toc_rela() netlink_release + 0x3a0 referring to .text.deferred_put_nlk_sk + 8 symbol's need_dynrela will not be set, failing the check for STT_FUNC in function_ptr_rela() and while parsing .rela.toc section it's not obvious, if .text.deferred_put_nlk_sk + 8 need_dynrela to be set, because we can not determine if it's function ptr.
Alternatively, the above loop can be re-written as:
list_for_each_entry(rela, &sec->relas, list) {
nr++; /* upper bound on number of kpatch relas and symbols */
if (may_need_dynrela(rela))
toc_rela(rela)->need_dynrela = 1;
}
With an additional check in may_need_dynrela():
static int may_need_dynrela(const struct rela *rela)
{
/* Don't over-ride need_dynrela flag if set, required for .toc sym*/
if (rela->need_dynrela || !rela->sym->sec)
return 1;
...
}
may_need_dynrela() and function_ptr_rela() operates on both rela and rela->sym->sec->rela. We might need to keep toc_rela() in both.
There was a problem hiding this comment.
Thanks for the explanation. Sorry, I keep forgetting that the R_PPC64_TOC16_HA relocation is for the original section, not the .toc. Can you add the above readelf snippet to a comment above the if (may_need_dynrela(rela)) line explaining why the toc_rela() is getting need_dynrela set? Otherwise I'll keep getting confused :-)
Can you also explain why you need to check rela->need_dynrela before setting toc_rela(rela)->need_dynrela? I'm trying to understand how it would be possible for a ".toc + xxx)" relocation to be converted to a dynrela.
Assuming we do really need that, how about doing the loop like this?
list_for_each_entry(rela, &sec->relas, list) {
nr++; /* upper bound on number of kpatch relas and symbols */
if (!rela->need_dynrela && may_need_dynrela(rela))
toc_rela(rela)->need_dynrela = 1;
}
Also a comment for the !strchr(toc_rela(rela)->sym->name, '.') check in may_need_dynrela() would also be helpful.
Another thing that would make it easier for me to follow would be to rename rela1 to rela_toc in function_ptr_rela().
Thanks!
There was a problem hiding this comment.
the if (may_need_dynrela(rela)) line explaining why the toc_rela() is getting need_dynrela set? Otherwise I'll keep getting confused :-)
Can you also explain why you need to check rela->need_dynrela before setting toc_rela(rela)->need_dynrela? I'm trying to understand how it would be possible for a ".toc + xxx)" relocation to be converted to a dynrela.
:) .rela.toc is superseded by .rela.text.* sections, those might have set the need_dynrela for entries in .rela.toc. Without the !rela->need_dynrela check, may_need_dynrela() for .rela.toc rela* will unset the need_dynrela flag, failing to rela->type check in function_ptr_rela().
.rela.toc rela type is always R_PPC64_ADDR64.
check in may_need_dynrela() would also be helpful.
function_ptr_rela(rela) will return 0, if the rela returned by toc_rela() is NULL. Duplicate check for NULL in may_need_dynrela(), can be avoided.
There was a problem hiding this comment.
Without the !rela->need_dynrela check, may_need_dynrela() for .rela.toc rela* will unset the need_dynrela flag, failing to rela->type check in function_ptr_rela().
Hm, but if it's already set, how would it get unset? To clarify, I'm talking about your rewritten version of the loop:
list_for_each_entry(rela, &sec->relas, list) {
nr++; /* upper bound on number of kpatch relas and symbols */
if (may_need_dynrela(rela))
toc_rela(rela)->need_dynrela = 1;
}
There was a problem hiding this comment.
Sorry got little mixed up here, true it will not get unset. Having the check for need_dynrela,
if (!rela->need_dynrela && may_need_dynrela(rela)) will speed the handling of .rela.toc entries but not sure, if its worth the optimization. Arguably .rela.toc is only on PPC64le and an additional check for rela entries for non-toc sections might be an overkill. I will re-spin the patch with
list_for_each_entry(rela, &sec->relas, list) {
nr++; /* upper bound on number of kpatch relas and symbols */
if (may_need_dynrela(rela))
toc_rela(rela)->need_dynrela = 1;
}
Thanks.
kpatch-build/kpatch-elf.h
Outdated
| int addend; | ||
| int offset; | ||
| char *string; | ||
| int need_dynrela; |
kpatch-build/kpatch-elf.c
Outdated
| rela->type = GELF_R_TYPE(rela->rela.r_info); | ||
| rela->addend = rela->rela.r_addend; | ||
| rela->offset = rela->rela.r_offset; | ||
| rela->need_dynrela = 0; |
There was a problem hiding this comment.
No need for this explicit initialization, we already memset the rela struct to zero in ALLOC_LINK.
465ec27 to
f52035c
Compare
|
v4 Changes:
|
| * if exist, to check for nested functions instead of rela->sym. | ||
| * Referenced sym holds the function name and rela->name holds | ||
| * reference section name. | ||
| */ |
There was a problem hiding this comment.
I don't think this comment is really needed. What I really wanted was a comment describing why it's doing the strchr() call with a check for the '.' character in the symbol name.
There was a problem hiding this comment.
I don't think this comment is really needed. What I really wanted was a comment describing why it's doing the strchr() call with a check for the '.' character in the symbol name.
Sorry if I have got it wrong, last paragraph from Evgenii comments above function_ptr_rela() explains why symbol name strchr() is checked for '.', Should that paragraph be moved here?
There was a problem hiding this comment.
Ah. Yes, I believe that last paragraph belongs here.
…ctly
A kernel crash happened in __do_softirq() in very rare cases when the
binary patch created from mainline commit be82485fbcbb
("netlink: fix an use-after-free issue for nlk groups") was unloaded.
Investigation has shown that the kernel tried to execute an RCU
callback, deferred_put_nlk_sk(), defined in the patch module after the
module had been unloaded.
The callback was set by the patched variant of netlink_release() and
the address of the patched deferred_put_nlk_sk() was used, rather than
the address of the original function.
Similar problems occur with workqueue functions as well.
As suggested in
dynup#755 (comment),
create-diff-object was modified so that the addresses of the original
functions were used in such situations, at least for x86 systems. A
similar fix for PowerPC was added as well.
Changes in v4:
* '#ifdef __x86_64__' was removed. It is not actually needed right now
because the constants for relocation types are different on different
architectures.
Changes in v3:
* Minor refactoring and a comment explaining what this all is about.
Quite lengthy, but the dynrela-related code is really far from obvious.
Changes in v2:
* Handle the nested functions the same way as before, because they are
unlikely to be used as asynchronous callbacks. Example: cmp() in
bch_cache_show() from drivers/md/bcache/sysfs.c in the kernel 4.4.
As the nested functions are local to the functions they are defined in,
the compiler names them in a similar way to static locals: <name>.<number>.
Currently, we filter out all functions with '.' in their names. If there
are any asynchronous callbacks in the kernel that have a dot in their
names too, they could be handled in the future patches. It is unclear
though, if the callbacks with such names can appear in the kernel.
Signed-off-by: Evgenii Shatokhin <eshatokhin@virtuozzo.com>
It was observed by Evgenii Shatokhin in PR#755, that when the RCU callback was called on the patched function, from unloaded livepatch module triggered a kernel crash. This patch implements the approach on PowerPC outlined in PR#755. With -mcmodel=large, like any other data, function pointers are also loaded relative to the current TOC base and are populated as relocation entries in .toc section. Every function passing a function pointer as the argument need to load the function address through .toc section + offset. Convert such .toc + offset relocation into a dynamic rela, which resolves to original function address, during module load. Also move the comment related to nested function check, into may_need_dynrela(). Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Evgenii Shatokhin <eshatokhin@virtuozzo.com> Cc: Joe Lawrence <jdl1291@gmail.com> Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
f52035c to
19b0aba
Compare
|
v5 Changes:
|
jpoimboe
left a comment
There was a problem hiding this comment.
Thanks, this looks good. kpatch_create_intermediate_sections() could use some more cleanups, but I'll propose those in a separate PR.
|
👍 @kamalesh-babulal , I went reading back through @euspectre 's PR #755 , and remembered that I had posted this integration patch and test: function-pointer.patch.txt This should be applicable to PowerPC implementation as well, right? If so, I can add this test in a follow-up PR. |
|
I'm going to be out next week. Here's my (rough, untested) WIP for refactoring the dynrela code in case anybody's interested: https://github.com/jpoimboe/kpatch/tree/dynrela . I'll try to pick it up again in April. |
@joe-lawrence Yes, It will test the functionality on PowerPC too. Please add this test case to the integration tests. I have been testing this PR with the kernel commit be82485fbcb mentioned in the first comment of PR #755. Can you please merge this PR, so I can rebase and push the patch for PR #793 and other refactoring code too.
@jpoimboe I will try and review/test the patch next week. |
|
@kamalesh-babulal - okay, sounds good |
It was observed by Evgenii Shatokhin in #755, that when the RCU callback was called on the patched function, from unloaded livepatch module triggered a kernel crash.
This patch implements the approach on Powerpc outlined in #755. With -mcmodel=large, like any other data, function pointers are also loaded relative to the current TOC base and are populated as relocation entries in .toc section. Every function call/passing such function as pointers needs to access them through .toc section + offset.
The simplest approach would be to convert the .toc + offset relocation into a dynamic rela referring to the original function address but the same function is also a patched function. Thus, on PowerPC we need to append a new relocation entry into .rela.toc, duplicating the entry used as a function pointer and converting the newly added rela as dynamic rela.
Also, modify the function to load the new entry by modifying the addend value to newly created dynamic rela.