Improve the performance when using enumeration #8395
aplopez wants to merge 6 commits into SSSD:master
Code Review
This pull request effectively improves performance by optimizing logging, removing an unused function, and correcting a condition related to enumeration. The changes are well-aligned with the stated goals of enhancing enumeration performance, especially for large user bases. The addition of a new test case for the general enumeration scenario ensures that the modified logic is adequately covered.
Typo in the commit message: "We must look into de TS cache"
Fixed.
I think the fix is correct in the sense that it fixes a bug. But I think logic of In particular, if
                                      dn_filter, &ts_cache_res);
    if (ret != EOK && ret != ENOENT) {
        goto done;
    }
Can this go out immediately from else branch?
if (ts_res.count > 0) {} else {go-out}
I understand we cannot go to out because we still need to proceed with the code that follows the big if {} block.
Why?
- search is by name
- timestamp cache has nothing for this name pattern
Imo, this means main cache also has nothing. No?
Could you provide an example where ts-search finds nothing but main cache finds something?
It is not simply ignored. It is ignored if no entry is found, but if entries are found, it is used (and only those entries are returned).
In addition, one of the tests falls into this case and fails if I add the goto-out as you proposed.
Maybe there is an error in the test setup. I need to confirm this.
> It is not simply ignored. It is ignored if no entry is found, but if entries are found, it is used (and only those entries are returned).
Do I understand correctly it returns outdated entries in this case, despite filter clearly set?
Btw, other than this issue discussed in this thread, patch set looks good to me.
> > It is not simply ignored. It is ignored if no entry is found, but if entries are found, it is used (and only those entries are returned).
>
> Do I understand correctly it returns outdated entries in this case, despite filter clearly set?
Yes, that's what happens, but only in known cases. When you call FindByAttr(), it returns only new entries. Subsequent calls return only newer entries until there are none, at which point all the entries are returned.
Maybe @sumit-bose can explain why?
Function sysdb_enumpwent() is not used. It was replaced by sysdb_enumpwent_filter().
When there are too many users (17,000+) this message can be too long. Limit it to the first 50 characters. Resolves: SSSD#6951
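As an illustration of the 50-character limit described in this commit message, here is a minimal sketch in plain C. It is not the code from the patch: the helper name `format_filter_log`, the macro `FILTER_LOG_MAX` and the message text are all hypothetical, and the real change presumably goes through SSSD's own debug macros.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cap; the value 50 comes from the commit message. */
#define FILTER_LOG_MAX 50

/*
 * Format a log line that shows at most FILTER_LOG_MAX characters of
 * `filter`. Returns the number of characters written to `out`
 * (excluding the terminating NUL), or -1 on error or truncated output.
 */
static int format_filter_log(char *out, size_t out_size, const char *filter)
{
    int len;

    assert(out != NULL && filter != NULL);

    /* "%.*s" prints at most FILTER_LOG_MAX bytes of the string. */
    len = snprintf(out, out_size, "Searching with filter [%.*s]",
                   FILTER_LOG_MAX, filter);
    if (len < 0 || (size_t)len >= out_size) {
        return -1;
    }
    return len;
}
```

The precision in `%.*s` does the truncation, so no temporary copy of a very long filter is needed just to log it.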
We must look into the TS cache only when a name is provided. Using the TS cache on an unfiltered enumeration is useless. Resolves: SSSD#6951
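The condition described in this commit message can be sketched as a small predicate. This is only an illustration: `should_search_ts_cache` is a hypothetical helper, not an SSSD function, and treating an empty string like a missing name is an assumption made here.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: decide whether the timestamp (TS) cache is worth
 * querying for an enumeration request.
 */
static bool should_search_ts_cache(const char *name_filter)
{
    /* No name (or an empty one, by assumption) means an unfiltered
     * enumeration, where the TS cache lookup brings no benefit. */
    return name_filter != NULL && name_filter[0] != '\0';
}
```

A caller would go straight to the main cache search when this returns false.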
Added a case that was not checked before. It is the case when `attr`, `attr_name` and `addtl_filter` are all `NULL`.
Create the filter to retrieve only the requested entries. Do not create a new filter and search for matches if there are no results from the previous search. The called functions handle this case correctly, but why waste time calling them?
Function cache_req_user_by_filter_lookup() will set or not the recent filter depending on whether data->name.attr is set or not. As mentioned in the comment, it should be done based on whether the referenced attribute is name or not.
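The early exit described in this commit message could look roughly like the sketch below. Every name in it (`struct search_result`, `search_matching_entries`, `lookup_entries`, the call counter) is a hypothetical stand-in for illustration and not part of SSSD.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a search result; not an SSSD type. */
struct search_result {
    size_t count;
};

/* Counts how often the second, more expensive step runs. */
static int second_search_calls = 0;

/* Placeholder for "build a filter and search for matches". */
static int search_matching_entries(const struct search_result *first,
                                   struct search_result *out)
{
    second_search_calls++;
    out->count = first->count;
    return 0;
}

static int lookup_entries(const struct search_result *first,
                          struct search_result *out)
{
    assert(first != NULL && out != NULL);

    out->count = 0;
    if (first->count == 0) {
        /* Nothing to match against: skip the filter and the search. */
        return 0;
    }
    return search_matching_entries(first, out);
}
```

The result is the same as calling the second step with an empty input; the guard only saves the work of building the filter and issuing the search.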
Note: Covsan is green so far.

Hm, looks relevant, but why F44 only... race condition?
I reran the tests and a different test failed. 😮‍💨

And now all the tests passed. There is some instability in F44, but not related to this PR.
It is very suspicious that it was
I thought the same until I noticed this test failed once and never again. The second time a completely different test failed. The third time, the latest, none.
This PR includes:
Enumeration, especially when there are 15,000+ users, is slow. This fix helps, but it doesn't work miracles.
In my test environment, the enumeration went from 8 minutes to about 1.
It is important to know that, with such a number of users, many operations time out. It is necessary to increase the `timeout` in `[nss]` and for the domain, but also set large values for `ldap_enumeration_refresh_timeout` and `ldap_search_timeout` in the domain. I used these values to avoid any timeout (YMMV):