Utf 8 encoding issue by ChallaHalla · Pull Request #3583 · Shopify/ruby-lsp

ChallaHalla · 2025-06-04T19:32:59Z

Motivation

Closes: #3494
Closes: #2446

This PR addresses an issue where positions in UTF-8 encoded text were resolved to incorrect offsets within the language server. The language server was calculating offsets by counting code points, but in UTF-8 a code unit is a single byte, so offsets must be counted in bytes (as explained in the spec).
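To make the mismatch concrete, here is a minimal Ruby sketch that counts the same string in code points, UTF-8 code units, and UTF-16 code units. The helper name `code_unit_counts` is hypothetical and is not part of ruby-lsp. It only illustrates why a code point count and a UTF-8 offset diverge as soon as the text contains a multi-byte character.

```ruby
# Hypothetical helper for illustration only; not part of ruby-lsp.
# Expects a UTF-8 encoded string.
def code_unit_counts(text)
  {
    code_points: text.length,   # number of Unicode code points
    utf8: text.bytesize,        # UTF-8 code units are bytes
    # Each UTF-16 code unit is two bytes wide.
    utf16: text.encode(Encoding::UTF_16LE).bytesize / 2,
  }
end

# "a" is 1 byte, "é" (U+00E9) is 2 bytes, U+1F600 is 4 bytes in UTF-8.
p code_unit_counts("a\u00E9\u{1F600}")
```

For plain ASCII text all three counts agree, which is likely why this class of bug only shows up in files containing non-ASCII characters.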

Implementation

To fix this issue, the Scanner class was updated to calculate the offset by taking a byte slice of the current text up to the given position and counting the characters in that slice. That count is then added to the offset provided by the new character, which results in a correct char offset.
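The byte-slice idea can be sketched roughly as follows. This is not the actual Scanner code from the PR: the method name `utf8_position_to_char_index` and its keyword arguments are assumptions made for illustration. It treats `character` as a byte offset within the line (the UTF-8 interpretation), slices that many bytes, and counts the characters in the slice.

```ruby
# Hypothetical sketch, not the ruby-lsp Scanner implementation.
# Converts a zero-based line and a UTF-8 (byte) character offset into a
# character index into `source`, which must be a UTF-8 string.
def utf8_position_to_char_index(source, line:, character:)
  chars_before_line = 0

  source.each_line.with_index do |text, index|
    if index == line
      raise ArgumentError, "character is past the end of the line" if character > text.bytesize

      # Take the bytes up to the requested position...
      slice = text.byteslice(0, character)
      # ...and reject offsets that land in the middle of a multi-byte character.
      raise ArgumentError, "offset splits a multi-byte character" unless slice.valid_encoding?

      # The number of characters in the slice is the char offset within the line.
      return chars_before_line + slice.length
    end

    chars_before_line += text.length
  end

  # Lines that each_line never yields are rejected in this sketch.
  raise ArgumentError, "line #{line} is out of range"
end

source = "a = \"\u00E9\"\nb = 1\n"
index = utf8_position_to_char_index(source, line: 0, character: 7)
puts source[index] # the closing quote after the two-byte "é"
```

Counting code points directly would have treated offset 7 as pointing one character too far on that first line, since "é" occupies two bytes but only one character.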

Automated Tests

Added some unit tests.

Manual Tests

Tested in Neovim.

graphite-app · 2025-06-04T19:33:07Z

How to use the Graphite Merge Queue

Add the label graphite-merge to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

vinistock · 2025-06-04T20:24:30Z

Utf 8 encoding issue #3583 👈 (View in Graphite)
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

Remove excessive mutex synchronization (#3637)

### Motivation

With the improvements to the position scanners (#3612 and #3583), we're now seeing some cases of invalid location errors being raised, despite no reports of any issues regarding the state of the documents.

I think we're making a mistake by trying to apply document operations with higher priority than feature requests. For example, if we receive a request for completion and the user immediately edits the file, the server might process the edit before finishing the completion request, and there are no guarantees that the new state of the document can satisfy that original request.

I want to propose pushing text synchronization operations to the queue, so that they are processed in order and without locking, which I believe will improve the situation.

### Implementation

I started pushing text synchronization operations to the work queue and removed many of the mutex locks that we originally had. Since requests are processed in order, there should not be a chance of a feature request being processed with an incorrect document state.

Co-authored-by: vinistock <18742907+vinistock@users.noreply.github.com>
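The ordering argument in the #3637 description can be illustrated with a small Ruby sketch. It is not the ruby-lsp server loop: the method `process_in_order`, the message hashes, and the way a completion request is "answered" are all assumptions made for illustration. The point is only that a single worker draining one queue sees edits and feature requests in arrival order, so no mutex around the document is needed.

```ruby
# Hypothetical sketch of in-order processing; not the ruby-lsp server.
# Every message, edits included, goes through one queue drained by one worker.
def process_in_order(messages)
  queue = Thread::Queue.new
  document = ""  # only the worker thread touches this, so no lock is needed
  results = []

  worker = Thread.new do
    # pop returns nil once the queue is closed and empty, ending the loop.
    while (message = queue.pop)
      case message[:method]
      when "textDocument/didChange"
        document = message[:text]
      when "textDocument/completion"
        # Record the document state this request was answered against.
        results << [message[:id], document]
      end
    end
  end

  messages.each { |message| queue << message }
  queue.close
  worker.join
  results
end

p process_in_order([
  { method: "textDocument/didChange", text: "foo" },
  { id: 1, method: "textDocument/completion" },
  { method: "textDocument/didChange", text: "foo.bar" },
  { id: 2, method: "textDocument/completion" },
])
```

In this sketch request 1 is always answered against the text as it was when the request arrived, even though a later edit is already waiting in the queue.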

ChallaHalla and others added 2 commits June 4, 2025 15:21

Fixing a bug where UTF-8 encodings resulted in incorrect char offset calculations.

bad03fb

Co-authored-by: Vinicius Stock <vinistock@users.noreply.github.com>

Revert "Fail requests that are searching for a non existing position (#2938)"

415d384

This reverts commit 0d421a6. Co-authored-by: Vinicius Stock <vinistock@users.noreply.github.com>

ChallaHalla requested a review from a team as a code owner June 4, 2025 19:32

ChallaHalla added the bugfix (This PR will fix an existing bug) and server (This pull request should be included in the server gem's release notes) labels Jun 4, 2025

vinistock approved these changes Jun 4, 2025

Calculate character length based on encountered bytes

e4a012f

vinistock force-pushed the utf-8-encoding-issue branch from ff2271a to e4a012f Compare June 4, 2025 20:27

vinistock enabled auto-merge (squash) June 4, 2025 20:29

vinistock merged commit af30123 into main Jun 4, 2025
36 checks passed

vinistock deleted the utf-8-encoding-issue branch June 4, 2025 20:51

BrewTestBot mentioned this pull request Jun 6, 2025

ruby-lsp 0.24.0 Homebrew/homebrew-core#226000

Merged

vinistock mentioned this pull request Jun 17, 2025

Advance scanner position by byte length while searching for line #3612

Merged

vinistock mentioned this pull request Jun 30, 2025

Remove excessive mutex synchronization #3637

Merged

r7kamura mentioned this pull request Aug 23, 2025

Fix LSP Position handling for multi-byte characters throughout the project r7kamura/vscode-ruby-light#45

Closed

vinistock merged 3 commits into main from utf-8-encoding-issue
