[feature](cloud) Add lazy commit mechanism for commit_txn#38243
[feature](cloud) Add lazy commit mechanism for commit_txn#38243gavinchou merged 5 commits intoapache:masterfrom
commit_txn#38243Conversation
|
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website. |
|
run buildall |
0cac9c5 to
d47f33a
Compare
|
run buildall |
d47f33a to
a7e2b1b
Compare
|
run buildall |
1 similar comment
|
run buildall |
f70c9f4 to
d7c4b5c
Compare
|
run buildall |
d7c4b5c to
bf6aad6
Compare
|
run buildall |
bf6aad6 to
9511fed
Compare
|
run buildall |
1 similar comment
|
run buildall |
| } | ||
|
|
||
| err = txn->commit(); | ||
| if (err != TxnErrorCode::TXN_OK) { |
There was a problem hiding this comment.
There may be conflicts caused by concurrent commits. According to our design expectations, TXN_CONFLICT also needs to be handled. One simple solution is to consider the data already converted if TXN_CONFLICT occurs (this requires that each task uses the same split batch method). Another solution is to retry once if TXN_CONFLICT occurs.
It's essential to manage the competition between transactions properly here.
9511fed to
8ed4422
Compare
|
run buildall |
8ed4422 to
ae48134
Compare
|
run buildall |
ae48134 to
f23de13
Compare
|
run buildall |
f23de13 to
c0a6c5a
Compare
TPC-DS: Total hot run time: 191755 ms |
ClickBench: Total hot run time: 31.14 s |
|
run buildall |
| std::unique_ptr<Transaction>& txn, MetaServiceCode& code, | ||
| std::string& msg); | ||
|
|
||
| void convert_tmp_rowsets( |
There was a problem hiding this comment.
warning: function 'convert_tmp_rowsets' exceeds recommended size/complexity thresholds [readability-function-size]
void convert_tmp_rowsets(
^Additional context
cloud/src/meta-service/txn_lazy_committer.cpp:40: 154 lines including whitespace and comments (threshold 80)
void convert_tmp_rowsets(
^| } | ||
| } | ||
|
|
||
| void make_committed_txn_visible(const std::string& instance_id, int64_t db_id, int64_t txn_id, |
There was a problem hiding this comment.
warning: function 'make_committed_txn_visible' exceeds recommended size/complexity thresholds [readability-function-size]
void make_committed_txn_visible(const std::string& instance_id, int64_t db_id, int64_t txn_id,
^Additional context
cloud/src/meta-service/txn_lazy_committer.cpp:200: 82 lines including whitespace and comments (threshold 80)
void make_committed_txn_visible(const std::string& instance_id, int64_t db_id, int64_t txn_id,
^| DCHECK(txn_id > 0); | ||
| } | ||
|
|
||
| void TxnLazyCommitTask::commit() { |
There was a problem hiding this comment.
warning: function 'commit' exceeds recommended size/complexity thresholds [readability-function-size]
void TxnLazyCommitTask::commit() {
^Additional context
cloud/src/meta-service/txn_lazy_committer.cpp:296: 138 lines including whitespace and comments (threshold 80)
void TxnLazyCommitTask::commit() {
^
TPC-H: Total hot run time: 38230 ms |
TPC-DS: Total hot run time: 192287 ms |
ClickBench: Total hot run time: 30.99 s |
|
run buildall |
| DCHECK(txn_id > 0); | ||
| } | ||
|
|
||
| void TxnLazyCommitTask::commit() { |
There was a problem hiding this comment.
warning: function 'commit' exceeds recommended size/complexity thresholds [readability-function-size]
void TxnLazyCommitTask::commit() {
^Additional context
cloud/src/meta-service/txn_lazy_committer.cpp:296: 139 lines including whitespace and comments (threshold 80)
void TxnLazyCommitTask::commit() {
^|
run buildall |
TPC-H: Total hot run time: 38261 ms |
TPC-DS: Total hot run time: 195283 ms |
ClickBench: Total hot run time: 31.9 s |
Motivation:
* In cloud mode we use `foundationdb` for storing rowset meta, when we load data with many rowsets,
`commit_txn` will failed because `foundationdb` currently limits all transactions to be below 10 MB in size,
So we spilt `commit_txn` into several sub fdb txn to solve the problem
How:
`commit_txn` flow like this
sub txn 1:
1. update partition `VersionPB` with txn_id
2. submit async task to txn_lazy_committer
txn_lazy_committer:
sub txn 2:
convert tmp rowset meta per batch
sub txn 3:
convert tmp rowset meta per batch
......
sub txn n:
make txn visible and remove txn_id in `VersionPB`
`get_version` main flow like this:
1. `VersionPB` without txn_id, the same as before
2. `VersionPB` has txn_id field, it means this is a txn
need be advanced by txn_lazy_committer, so submit a
`txn_lazy_task` to make last txn visible and retry
|
run buildall |
TPC-H: Total hot run time: 38018 ms |
TPC-DS: Total hot run time: 192851 ms |
ClickBench: Total hot run time: 32.53 s |
|
PR approved by at least one committer and no changes requested. |
1 similar comment
|
PR approved by at least one committer and no changes requested. |
Motivation:
foundationdbfor storing rowset meta, when we load data with many rowsets,commit_txnwill failed becausefoundationdbcurrently limits all transactions to be below 10 MB in size, So we spiltcommit_txninto several sub fdb txn to solve the problemHow:
Now
commit_txnmain flow like thissub txn 1:
VersionPBwith txn_idtxn_lazy_committer:
sub txn 2:
convert tmp rowset meta per batch
sub txn 3:
convert tmp rowset meta per batch
......
sub txn n:
make txn visible and remove txn_id in
VersionPB