Conversation

Contributor

@xe-nvdk xe-nvdk commented Feb 1, 2026

Hey team, a new version has been released, so here are new benchmark results. I only ran c6a.4xlarge; I understand that, after verification, you will run the other sizes? If not, let me know and I can update the other sizes too.

@rschu1ze rschu1ze self-assigned this Feb 2, 2026
Contributor Author

xe-nvdk commented Feb 2, 2026

I was wondering: how is the automated benchmark performed on your side? Does it start right away, as soon as the instance is created and accessible?

I'm asking because I noticed that when the instance is ready to access but the disk is not yet completely initialized, latency is higher, and not only for Arc; I imagine that applies to all systems. In my case, I always wait a minute or two to make sure the newly created EBS volume reports that initialization has completed.

https://docs.aws.amazon.com/ebs/latest/userguide/initalize-volume.html

When you create an Amazon EBS volume, either from an EBS snapshot or from another EBS volume (volume copy), the data blocks must be written to the volume before you can access them. For volumes created from snapshots, the data blocks must be downloaded from Amazon S3 to the new volume. For volume copies, the data blocks must be copied from the source volume to the volume copy. This process is called volume initialization. During this time, the volume being initialized might experience increased I/O latency and decreased performance. Full volume performance is achieved only once all storage blocks have been downloaded and written to the volume.

Member

rschu1ze commented Feb 2, 2026

These two scripts implement the automated runs:

Does it start right away, as soon as the instance is created and accessible?

I would be very surprised if the EBS volume is not available right away.

Contributor Author

xe-nvdk commented Feb 2, 2026

These two scripts implement the automated runs:

Does it start right away, as soon as the instance is created and accessible?

I would be very surprised if the EBS volume is not available right away.

It is available, but initialization occurs in the background. What I found is that when you clone the repo and run the benchmark right away, before that initialization has completed, the results are worse.

Remember that the Ubuntu AMI uses a snapshot, and as the AWS page I quoted confirms, performance is degraded during initialization.

During this time, the volume being initialized might experience increased I/O latency and decreased performance. Full volume performance is achieved only once all storage blocks have been downloaded and written to the volume.

@rschu1ze rschu1ze merged commit 0c7276e into ClickHouse:main Feb 2, 2026
Contributor Author

xe-nvdk commented Feb 2, 2026

@rschu1ze When you have a moment, please let me know whether I should provide the results for the other sizes or whether that is managed on your end. Thank you.

@rschu1ze rschu1ze mentioned this pull request Feb 2, 2026
Member

rschu1ze commented Feb 2, 2026

@rschu1ze When you have a moment, please let me know whether I should provide the results for the other sizes or whether that is managed on your end. Thank you.

--> #776

About the volume initialization issue: I'm no real expert in AWS, would you please open an issue? I'm particularly interested in scripting that could prove there's an issue.
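
A minimal sketch of what such a script might look like, assuming a Linux instance and a large file or block device on the volume under test. It reads the same target twice and times each pass; a much slower first pass would be consistent with lazy block loading, though it is not proof on its own. The function names and the target path are hypothetical, and dropping the page cache with `posix_fadvise` is best effort only.

```python
import os
import sys
import time

CHUNK = 1024 * 1024  # read in 1 MiB chunks


def timed_read_pass(path, limit_bytes=None):
    """Sequentially read `path` once; return (bytes_read, seconds)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Best effort: ask the kernel to drop cached pages for this file so
        # the pass is served by storage rather than by the page cache.
        if hasattr(os, "posix_fadvise"):
            try:
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                pass
        total = 0
        start = time.perf_counter()
        while limit_bytes is None or total < limit_bytes:
            want = CHUNK if limit_bytes is None else min(CHUNK, limit_bytes - total)
            chunk = os.read(fd, want)
            if not chunk:  # end of file or device
                break
            total += len(chunk)
        return total, time.perf_counter() - start
    finally:
        os.close(fd)


def compare_passes(path, limit_bytes=None):
    """Run two identical read passes and return both results."""
    first = timed_read_pass(path, limit_bytes)
    second = timed_read_pass(path, limit_bytes)
    return first, second


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical path): python3 read_passes.py /path/to/large/file
    results = compare_passes(sys.argv[1])
    for label, (nbytes, secs) in zip(("first", "second"), results):
        print(f"{label} pass: {nbytes / 2**20:.0f} MiB in {secs:.2f} s")
```

Running it once right after the instance comes up and again a few minutes later, on fresh instances each time, would give numbers that can be attached to the issue.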

Contributor Author

xe-nvdk commented Feb 2, 2026

@rschu1ze When you have a moment, please let me know whether I should provide the results for the other sizes or whether that is managed on your end. Thank you.

--> #776

About the volume initialization issue: I'm no real expert in AWS, would you please open an issue? I'm particularly interested in scripting that could prove there's an issue.

I will do. Thank you.

nwoolmer commented Feb 3, 2026

@rschu1ze https://docs.aws.amazon.com/ebs/latest/userguide/ebs-fast-snapshot-restore.html

If it's a fresh gp2 drive, there shouldn't be an issue.

But if you are creating the volume from a snapshot that already has the files on it, data blocks are lazily and transparently pulled from S3. This increases first-touch latency. You can avoid it by using fast snapshot restore or by touching the volume with fio.
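
To illustrate what "touching the volume" means, here is a rough Python stand-in for an fio read pass, assuming Linux and read access to the target. It reads every block once using several concurrent readers. The function names and worker count are hypothetical, and unlike fio with direct I/O it goes through the page cache, so treat it as a sketch rather than a replacement.

```python
import os
from concurrent.futures import ThreadPoolExecutor

BLOCK = 1024 * 1024  # 1 MiB per read request


def _read_range(fd, start, end):
    """Read bytes [start, end) from fd in BLOCK-sized pieces; return the count."""
    done = 0
    offset = start
    while offset < end:
        data = os.pread(fd, min(BLOCK, end - offset), offset)
        if not data:  # unexpected end of file or device
            break
        offset += len(data)
        done += len(data)
    return done


def touch_all_blocks(path, workers=8):
    """Read the whole file or block device once; return total bytes read."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end yields the size for regular files and,
        # on Linux, for block devices as well.
        size = os.lseek(fd, 0, os.SEEK_END)
        # Split the target into at most `workers` contiguous ranges.
        step = max(BLOCK, -(-size // workers))  # ceiling division
        ranges = [(s, min(s + step, size)) for s in range(0, size, step)]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return sum(pool.map(lambda r: _read_range(fd, *r), ranges))
    finally:
        os.close(fd)
```

Calling `touch_all_blocks` on the device (as root) before starting a benchmark would force every block to be fetched up front, at the cost of a full read of the volume.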

Member

rschu1ze commented Feb 3, 2026

Interesting stuff. I read what

--block-device-mappings 'DeviceName=/dev/sda1,Ebs={DeleteOnTermination=true,VolumeSize=500,VolumeType=gp2}' \

in this does and it indeed creates a new gp2 drive. I think we are good. Thanks for the input.

3 participants