Improve word-space detection in getTextContent #21432

Open
calixteman wants to merge 1 commit into mozilla:master from calixteman:fix_spaces

@calixteman

Contributor

The gap between two glyphs was turned into a space using a hardcoded `trackingSpaceMin` of `0.102 * fontSize`. Depending on the font this was either too large or too small, producing fake spaces inside words ("Robe rt") or a space between every letter of letter-spaced text ("R E A S O N S").
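
To make the failure mode concrete, here is a minimal sketch of a fixed-ratio check. Only `trackingSpaceMin` and the `0.102` factor come from the description above; the function name `isSpaceFixed` and the gap values are hypothetical, and this is not the code in src/core/evaluator.js.

```javascript
// Hypothetical sketch of the old behaviour: a gap counts as a word space
// when it exceeds a fixed fraction of the font size.
const TRACKING_SPACE_FACTOR = 0.102; // value quoted in the description

function isSpaceFixed(gap, fontSize) {
  const trackingSpaceMin = TRACKING_SPACE_FACTOR * fontSize;
  return gap > trackingSpaceMin;
}

// Illustrative numbers only (not measured from a real PDF).
// At fontSize 10 the threshold is about 1.02 units.
const fontSize = 10;

// A loosely kerned font whose intra-word gap is 1.2 units.
const kernedGap = 1.2;

// A letter-spaced heading where every gap is 3 units.
const trackedGaps = [3, 3, 3, 3, 3, 3];

// The intra-word gap is misread as a space ("Robe rt").
console.log(isSpaceFixed(kernedGap, fontSize)); // true

// Every gap of the heading is misread as a space ("R E A S O N S").
console.log(trackedGaps.every(gap => isSpaceFixed(gap, fontSize))); // true
```

Because the threshold depends only on `fontSize`, nothing in this check can tell a wide-set font or a tracked heading apart from real word spaces.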

The threshold is now derived from the font's space width (or from the glyph's advance width for Type3 fonts, where the font size is meaningless) and is raised for uniformly letter-spaced runs. A space wrongly inserted before the first glyph of such a run is retracted once the next gap confirms that the run is letter-spaced, and the adaptive state is reset on spacing or text-matrix changes so it can't leak across sections.
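
The following is a rough sketch of one way to read that paragraph, not the patch itself. Every name (`SpaceDetector`, `referenceWidth`, `extract`, the `isType3`/`spaceWidth`/`advance` fields) and every constant (`SPACE_RATIO`, `TRACKING_SLACK`, `TOLERANCE`) is an assumption made for illustration.

```javascript
// Hypothetical sketch; names and constants are illustrative assumptions.
const SPACE_RATIO = 0.5; // assumed fraction of the reference width
const TRACKING_SLACK = 1.5; // assumed multiplier applied to a uniform gap
const TOLERANCE = 0.05; // assumed tolerance when comparing two gaps

// Reference width: the font's space width, or the glyph's own advance
// for Type3 fonts, where the font size carries no usable information.
function referenceWidth(font, glyph) {
  return font.isType3 ? glyph.advance : font.spaceWidth;
}

class SpaceDetector {
  constructor() {
    this.reset();
  }

  // Call on spacing or text-matrix changes so that state learned in one
  // section cannot leak into the next.
  reset() {
    this.trackingGap = null; // uniform gap of the current letter-spaced run
    this.pending = null; // { index, gap } of a space that may be retracted
  }

  // Appends glyph.char to `out`, preceded by " " if `gap` looks like a
  // word space.
  push(out, font, glyph, gap) {
    const base = SPACE_RATIO * referenceWidth(font, glyph);
    if (this.pending !== null) {
      if (Math.abs(gap - this.pending.gap) <= TOLERANCE) {
        // Two equal wide gaps in a row: treat the run as letter-spaced
        // and retract the space emitted at the first wide gap.
        out.splice(this.pending.index, 1);
        this.trackingGap = gap;
      }
      this.pending = null;
    }
    const threshold =
      this.trackingGap === null
        ? base
        : Math.max(base, TRACKING_SLACK * this.trackingGap);
    if (gap > threshold) {
      if (this.trackingGap === null) {
        // Remember where the space lands in case the next gap shows it
        // was only letter-spacing.
        this.pending = { index: out.length, gap };
      }
      out.push(" ");
    }
    out.push(glyph.char);
  }
}

function extract(font, glyphs) {
  const detector = new SpaceDetector();
  const out = [];
  for (const { char, advance, gap } of glyphs) {
    detector.push(out, font, { char, advance }, gap);
  }
  return out.join("");
}

// Usage with made-up metrics: a heading tracked by 3 units per glyph.
const font = { isType3: false, spaceWidth: 2.5 };
const tracked = [..."REASONS"].map((char, i) => ({
  char,
  advance: 5,
  gap: i === 0 ? 0 : 3,
}));
console.log(extract(font, tracked)); // "REASONS"
```

This sketch would wrongly merge consecutive single-letter words separated by equal gaps, so a real implementation presumably needs additional guards beyond what is shown here.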

It fixes #18768 and #16752.

@calixteman requested a review from timvandermeij on June 11, 2026, 20:27

@codecov-commenter

codecov-commenter commented Jun 11, 2026


Codecov Report

❌ Patch coverage is 94.54545% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.71%. Comparing base (e75a7cf) to head (526e44e).
⚠️ Report is 20 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/core/evaluator.js | 92.94% | 6 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #21432      +/-   ##
==========================================
- Coverage   89.76%   89.71%   -0.06%     
==========================================
  Files         260      260              
  Lines       66032    66100      +68     
==========================================
+ Hits        59274    59300      +26     
- Misses       6758     6800      +42     

| Flag | Coverage Δ |
|---|---|
| browsertest | 66.64% <84.54%> (-0.21%) ⬇️ |
| fonttest | 9.02% <0.00%> (-0.01%) ⬇️ |
| integrationtest | 68.90% <74.54%> (+2.06%) ⬆️ |
| unittest | 57.26% <88.18%> (+0.08%) ⬆️ |
| unittestcli | 56.24% <88.18%> (+0.11%) ⬆️ |

@calixteman

Contributor Author

/botio browsertest

@moz-tools-bot

Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_browsertest from @calixteman received. Current queue size: 0

Live output at: http://54.241.84.105:8877/faf7e21cfd9433b/output.txt

@moz-tools-bot

Collaborator

From: Bot.io (Windows)


Received

Command cmd_browsertest from @calixteman received. Current queue size: 1

Live output at: http://54.193.163.58:8877/c4ed1bd477dcd9f/output.txt

@moz-tools-bot

Collaborator

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/faf7e21cfd9433b/output.txt

Total script time: 18.64 mins

  • Regression tests: FAILED
  different ref/snapshot: 7

Image differences available at: http://54.241.84.105:8877/faf7e21cfd9433b/reftest-analyzer.html#web=eq.log

@moz-tools-bot

Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/c4ed1bd477dcd9f/output.txt

Total script time: 46.58 mins

  • Regression tests: FAILED
  different ref/snapshot: 7

Image differences available at: http://54.193.163.58:8877/c4ed1bd477dcd9f/reftest-analyzer.html#web=eq.log

Comment thread src/core/fonts.js Outdated

@timvandermeij left a comment

Contributor

r=me, with the comment above addressed and passing integration/browser tests. Looks good!

@calixteman

Contributor Author

/botio browsertest

@moz-tools-bot

Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_browsertest from @calixteman received. Current queue size: 0

Live output at: http://54.241.84.105:8877/32efafc3ddbba5e/output.txt

@moz-tools-bot

Collaborator

From: Bot.io (Windows)


Received

Command cmd_browsertest from @calixteman received. Current queue size: 2

Live output at: http://54.193.163.58:8877/ce9912cab13b025/output.txt

@moz-tools-bot

Collaborator

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/32efafc3ddbba5e/output.txt

Total script time: 18.27 mins

  • Regression tests: FAILED
  different ref/snapshot: 5

Image differences available at: http://54.241.84.105:8877/32efafc3ddbba5e/reftest-analyzer.html#web=eq.log

@moz-tools-bot

Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/ce9912cab13b025/output.txt

Total script time: 0.59 mins

@calixteman

Contributor Author

/botio browsertest

@moz-tools-bot

Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_browsertest from @calixteman received. Current queue size: 0

Live output at: http://54.241.84.105:8877/9a00c9dc6cca79d/output.txt

@moz-tools-bot

Collaborator

From: Bot.io (Windows)


Received

Command cmd_browsertest from @calixteman received. Current queue size: 0

Live output at: http://54.193.163.58:8877/f923eb1aab2efab/output.txt

@moz-tools-bot

Collaborator

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/9a00c9dc6cca79d/output.txt

Total script time: 18.26 mins

  • Regression tests: FAILED
  different ref/snapshot: 5

Image differences available at: http://54.241.84.105:8877/9a00c9dc6cca79d/reftest-analyzer.html#web=eq.log

@moz-tools-bot

Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/f923eb1aab2efab/output.txt

Total script time: 46.63 mins

  • Regression tests: FAILED
  different ref/snapshot: 5

Image differences available at: http://54.193.163.58:8877/f923eb1aab2efab/reftest-analyzer.html#web=eq.log

@calixteman

Contributor Author

/botio browsertest

@moz-tools-bot

Collaborator

From: Bot.io (Windows)


Received

Command cmd_browsertest from @calixteman received. Current queue size: 0

Live output at: http://54.193.163.58:8877/565979537063732/output.txt

@moz-tools-bot

Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_browsertest from @calixteman received. Current queue size: 0

Live output at: http://54.241.84.105:8877/e1b3d9e78915361/output.txt

@moz-tools-bot

Collaborator

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/e1b3d9e78915361/output.txt

Total script time: 21.41 mins

  • Regression tests: FAILED
  errors: 6
  different ref/snapshot: 5

Image differences available at: http://54.241.84.105:8877/e1b3d9e78915361/reftest-analyzer.html#web=eq.log

@moz-tools-bot

Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/565979537063732/output.txt

Total script time: 46.86 mins

  • Regression tests: FAILED
  different ref/snapshot: 5

Image differences available at: http://54.193.163.58:8877/565979537063732/reftest-analyzer.html#web=eq.log

Development

Successfully merging this pull request may close these issues.

[Bug]: Hardcoded textContentItem.trackingSpaceMin causes incorrect fake spaces

5 participants