[fix](clone) Fix clone and alter tablet use same tablet path #34889
dataroaring merged 12 commits into apache:master
    &store, _clone_req.partition_id));
    auto tablet_dir = fmt::format("{}/{}/{}", local_shard_root_path, _clone_req.tablet_id,
                                  _clone_req.schema_hash);
    auto tablet_manager = _engine.tablet_manager();
warning: 'auto tablet_manager' can be declared as 'auto *tablet_manager' [readability-qualified-auto]
Suggested change:
    - auto tablet_manager = _engine.tablet_manager();
    + auto *tablet_manager = _engine.tablet_manager();
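To illustrate the clang-tidy suggestion above, here is a minimal sketch of what `readability-qualified-auto` asks for: when `auto` deduces a pointer, spelling it `auto*` makes the pointer visible at the declaration. `Engine` and `TabletManager` below are hypothetical stand-ins, not the Doris classes; the sketch only assumes that `tablet_manager()` returns a raw pointer, which is what the warning implies.

```cpp
// Hypothetical stand-in types; only the pointer-returning accessor matters here.
struct TabletManager {
    int tablet_count = 0;
};

struct Engine {
    TabletManager manager;
    // Assumption: the accessor returns a raw pointer, as the warning implies.
    TabletManager* tablet_manager() { return &manager; }
};

int register_tablet(Engine& engine) {
    // `auto*` deduces the same type as plain `auto` (TabletManager*),
    // but the declaration itself now shows that this is a pointer.
    auto* tablet_manager = engine.tablet_manager();
    tablet_manager->tablet_count += 1;
    return tablet_manager->tablet_count;
}
```

Both spellings compile to the same thing; the change is purely about readability at the call site.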
run buildall
TPC-DS: Total hot run time: 172537 ms
ClickBench: Total hot run time: 30.25 s
run buildall
PR approved by anyone and no changes requested.
TPC-H: Total hot run time: 42083 ms
TPC-DS: Total hot run time: 168084 ms
ClickBench: Total hot run time: 30.57 s
clang-tidy review says "All clean, LGTM! 👍"
TPC-H: Total hot run time: 39967 ms
TPC-DS: Total hot run time: 172966 ms
ClickBench: Total hot run time: 30.55 s
run buildall
clang-tidy review says "All clean, LGTM! 👍"
PR approved by at least one committer and no changes requested.
TPC-H: Total hot run time: 40312 ms
TPC-DS: Total hot run time: 172807 ms
ClickBench: Total hot run time: 31.16 s
Proposed changes
Issue Number: close #xxx
The entire process is as follows:
1. Drop the tablet.
2. Successfully clone the tablet in full.
3. Start the incremental clone.
4. Start to move the tablet to the trash (this cleanup was initiated in step 1, when the tablet was dropped).
5. The incremental clone fails.
6. The incremental clone is successfully retried.
Step 4 moved the data that had just been pulled by the full clone to the trash, leading to data loss.
The incremental clone in step 5 failed for the same reason: the just-pulled snapshot data had been deleted.
Fix:
When cloning, check the tablet status and determine whether the tablet directory has already been moved to the trash directory. If it has not, the clone thread should move it there itself.