HBASE-29668 Add row cache framework#7398
Conversation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
f95570c to
cc96b4c
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
I see this PR got an unrelated commit. Do you want me to rebase this branch on top of master, @EungsopYoo? You can then just force push your initial commit to your remote branch. |
6136446 to
0df6d15
Compare
|
@wchevreuil |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
c19a96d to
25cea27
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
ea0e73a to
0df6d15
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
@wchevreuil |
wchevreuil
left a comment
There was a problem hiding this comment.
Thanks for submitting this PR @EungsopYoo and sorry for the long delayed response. Let's resume the discussions on this. Please refer to my comments below:
I think we should avoid putting row cache related logic on RSRpcServices, as RSRpcServices is mainly a proxy between RS/Region logic and the RPC layer.
The row cache would be a core component of RegionServer, so in that sense, it should be created and kept at HRegionServer, just like we do for BlockCache.
Since row cache would be used on region related operations, all access logic to it should be implemented at HRegion class. HRegionServer should expose its row cache to HRegion instances via the RegionServerServices (as it does with BlockCache, for instance).
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RowCacheService.java
Outdated
Show resolved
Hide resolved
| package org.apache.hadoop.hbase.regionserver; | ||
|
|
||
| @org.apache.yetus.audience.InterfaceAudience.Private | ||
| public class RowCacheImpl implements RowCache { |
There was a problem hiding this comment.
nit: since we are not really doing anything here, can we remove this impl from this PR and focus only on the framework itself?
|
@wchevreuil |
No problem. Enjoy your time off! |
|
I have a lot of backlog, so I think I’ll be able to resume in about one to two weeks. |
0df6d15 to
193d215
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
193d215 to
927b2e6
Compare
|
@wchevreuil Should I open a separate PR for the update, or could you take care of it directly? |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
abb84a0 to
646ca7c
Compare
fe95513 to
74575e4
Compare
|
|
||
| RegionScannerImpl getScannerWithResults(Get get, Scan scan, List<Cell> results) | ||
| throws IOException { | ||
| return getRowCacheService().getScanner(this, get, scan, results); |
There was a problem hiding this comment.
We'll always go for the row cache? What if it's disabled?
There was a problem hiding this comment.
This is determined in RowCacheService.getScanner(). If the row cache is disabled, the system falls back to the traditional read path.
| RegionScannerImpl getScanner(HRegion region, Get get, Scan scan, List<Cell> results) | ||
| throws IOException { | ||
| if (!canCacheRow(get, region)) { | ||
| return getScannerInternal(region, scan, results); | ||
| } | ||
|
|
||
| RowCacheKey key = new RowCacheKey(region, get.getRow()); | ||
|
|
||
| // Try get from row cache | ||
| if (tryGetFromCache(region, key, get, results)) { | ||
| // Cache is hit, and then no scanner is created | ||
| return null; | ||
| } | ||
|
|
||
| RegionScannerImpl scanner = getScannerInternal(region, scan, results); | ||
| populateCache(region, results, key); | ||
| return scanner; | ||
| } |
There was a problem hiding this comment.
I think RowCacheService shouldn't bother about RegionScannerImpl, but rather only the cache itself, so it should only check for the rowkey in the cache and populate the results accordingly. It's then up for the HRegion to create the scanner when no results are found in the row cache. So we could make this method void, or return List with the found results, and get rid of the scanner creation logic. These would be done in HRegion.getScannerResults.
| RegionScannerImpl scanner = region.getScanner(scan); | ||
| scanner.next(results); | ||
| return scanner; |
There was a problem hiding this comment.
Should move to HRegion (see my previous comment above).
| return getScannerInternal(region, scan, results); | ||
| } | ||
|
|
||
| RowCacheKey key = new RowCacheKey(region, get.getRow()); |
There was a problem hiding this comment.
Could we use "namespace + tableame + rowkey" as the UID of a row in the row cache? We could then only pass the TableDescriptor, rather than the HRegion as this method parameter.
There was a problem hiding this comment.
If we want to support evicting the row cache when a region is closed, similar to how block cache eviction works, we need to include the region ID as part of the UID. Otherwise, it is impossible to determine which row cache entries are associated with the region being closed.
There was a problem hiding this comment.
That's a good point. Let's keep it as it is now (region name + row key).
699fb0a to
0bf3ccd
Compare
wchevreuil
left a comment
There was a problem hiding this comment.
LGTM, +1, just had some final nit comments.
| this.isRowCacheEnabled = determineRowCacheEnabled(); | ||
| } | ||
|
|
||
| boolean determineRowCacheEnabled() { |
There was a problem hiding this comment.
nit: call it "checkRowCacheConfig()" and make it private.
| return false; | ||
| } | ||
|
|
||
| boolean tryGetFromCache(HRegion region, RowCacheKey key, Get get, List<Cell> results) { |
There was a problem hiding this comment.
nit: no need for the region param anymore.
| return true; | ||
| } | ||
|
|
||
| void populateCache(HRegion region, List<Cell> results, RowCacheKey key) { |
There was a problem hiding this comment.
nit: no need for the region param anymore.
No description provided.