Optimize common path of File.dirname by byroot · Pull Request #15902 · ruby/ruby

byroot · 2026-01-18T19:19:30Z

Because it has to handle multibyte encodings, dirname scans the entire string from the start and maps all the separators before copying the relevant part.

The overwhelming majority of the time, the function is called with UTF-8 or ASCII strings, and level = 1.

If we optimize for that case specifically, we can use a much simpler and faster reverse search.

In addition, there are a lot of needless and/or duplicated safety checks performed by FilePathStringValue and StringValueCStr, so a lot of overhead can be removed by calling a simpler check function instead.

Similarly, the function was using a lot of generic rb_enc_ functions that waste a lot of time performing type checks, when we know we're dealing with strings.
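To illustrate the reverse-search idea described above, here is a minimal Ruby sketch. It is not the C code from this PR: `simple_dirname` is a hypothetical name, and it assumes a UTF-8 or ASCII path, `/` as the only separator, and `level = 1`. Under those assumptions a `/` byte can never be part of a multibyte character, so the scan can start from the end of the string and only touch the trailing bytes.

```ruby
SEPARATOR = 0x2f # "/"

# Hypothetical helper: dirname via a backwards byte scan.
# Assumes UTF-8/ASCII input, "/" as the only separator, and level = 1.
def simple_dirname(path)
  stop = path.bytesize

  # Skip trailing separators ("foo/bar/" behaves like "foo/bar").
  stop -= 1 while stop > 0 && path.getbyte(stop - 1) == SEPARATOR

  # Skip the basename by walking back to the previous separator.
  stop -= 1 while stop > 0 && path.getbyte(stop - 1) != SEPARATOR

  # Nothing is left: the path was empty, all separators, or a bare name.
  return path.getbyte(0) == SEPARATOR ? "/" : "." if stop == 0

  # Drop the separator(s) between the directory part and the basename.
  stop -= 1 while stop > 0 && path.getbyte(stop - 1) == SEPARATOR

  stop == 0 ? "/" : path.byteslice(0, stop)
end
```

On a POSIX system the results can be compared against `File.dirname` for the same inputs. The point of the sketch is that the work is proportional to the length of the last path component rather than to the length of the whole string.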


|       |compare-ruby|built-ruby|
|:------|-----------:|---------:|
|long   |      3.925M|   25.221M|
|       |           -|     6.43x|
|short  |     15.550M|   28.697M|
|       |           -|     1.85x|
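
The benchmark script behind this table is not shown here. As a rough way to reproduce a comparison of this kind on a single Ruby build, the following sketch counts `File.dirname` calls per second. The helper name and both sample paths are assumptions made for illustration; they are not the inputs that produced the numbers above.

```ruby
# Hypothetical inputs: a deep path and a shallow one.
LONG_PATH  = "/" + (["segment"] * 40).join("/")
SHORT_PATH = "/foo/bar"

# Runs File.dirname in batches for roughly `duration` seconds and
# returns the observed number of calls per second as a Float.
def dirname_calls_per_second(path, duration: 0.2, batch: 1_000)
  calls = 0
  elapsed = 0.0
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  while elapsed < duration
    batch.times { File.dirname(path) }
    calls += batch
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  end
  calls / elapsed
end

if __FILE__ == $PROGRAM_NAME
  { "long" => LONG_PATH, "short" => SHORT_PATH }.each do |name, path|
    rate = dirname_calls_per_second(path)
    puts format("%-5s %10.3fM calls/s", name, rate / 1e6)
  end
end
```

Running the same script under two different builds gives two columns to compare. Absolute numbers depend on the machine and will not match the table.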


byroot · 2026-01-18T21:55:06Z

I'm gonna wait on this, because instead of an additional fast path, we could perhaps always consider paths as ASCII compatible, and get most of the win while reducing complexity: byroot@56aec70

As far as I can tell, File methods reject non-ASCII-compatible paths:

>> File.dirname("/foo/bar/baz".encode(Encoding::UTF_16BE))
(irb):1:in 'File.dirname': path name must be ASCII-compatible (UTF-16BE): "/foo/bar/baz" (Encoding::CompatibilityError)

So I don't understand why we'd need all that multibyte handling code.

It was added in January 2012 in ed46983, but the encoding check was added later, in October 2012, in ad54de2, so perhaps it's just "dead code".

nobu · 2026-01-19T02:52:38Z

It is not dead code.
That Next macro is for encodings that are both ASCII-compatible and multibyte.
Concretely, 0x5c in the Shift_JIS family can be either "\" or a trailing byte.
As you know, "\" is a Windows path separator, so path traversal needs to be aware of multibyte characters.
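
A small Ruby sketch of the 0x5c ambiguity described above. In Shift_JIS the katakana "ソ" (U+30BD) is encoded as the two bytes 0x83 0x5C, so a scan that looks at raw bytes sees a backslash that a character-aware scan does not. The helper names and the sample path are made up for this illustration.

```ruby
BACKSLASH = 0x5c

# "dir", a real backslash, then "ソ" (0x83 0x5C in Shift_JIS), then ".txt".
SJIS_PATH = "dir\\\x83\x5C.txt".b.force_encoding(Encoding::Shift_JIS)

# Byte offsets of every 0x5c byte, ignoring character boundaries.
def naive_backslash_offsets(path)
  (0...path.bytesize).select { |i| path.getbyte(i) == BACKSLASH }
end

# Byte offsets of 0x5c bytes that are a whole character by themselves.
def char_aware_backslash_offsets(path)
  offsets = []
  pos = 0
  path.each_char do |ch|
    offsets << pos if ch.bytesize == 1 && ch.getbyte(0) == BACKSLASH
    pos += ch.bytesize
  end
  offsets
end

if __FILE__ == $PROGRAM_NAME
  p naive_backslash_offsets(SJIS_PATH)
  p char_aware_backslash_offsets(SJIS_PATH)
end
```

The naive scan reports two offsets and the character-aware scan reports one. On a platform where "\" is a separator, splitting at the second offset would cut "ソ" in half, which is why the traversal has to track character boundaries for this family of encodings.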

OTOH, you may be able to simplify the macro in the case FILE_ALT_SEPARATOR is not defined.

byroot · 2026-01-19T05:09:10Z

> Concretely, 0x5c in Shift_JIS family can be either \ or a trailing byte.

That makes a lot of sense.

> OTOH, you may be able to simplify the macro in the case FILE_ALT_SEPARATOR is not defined.

Interesting idea. Or perhaps pass encoding as NULL for the super common and safe encodings (ASCII, and UTF-8 basically).
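
The idea of special-casing ASCII and UTF-8 can be sketched in Ruby as a simple dispatch on the string's encoding. This is only an illustration of the reasoning, not the C implementation: `BYTE_SCAN_SAFE_ENCODINGS`, `scan_strategy`, and `multibyte_bytes_all_high?` are hypothetical names. UTF-8 is safe for a plain byte scan because every byte of a multibyte character is 0x80 or above, so a separator byte can only ever be a standalone ASCII character.

```ruby
# Encodings assumed safe for a plain byte scan (hypothetical list,
# following "ASCII, and UTF-8 basically").
BYTE_SCAN_SAFE_ENCODINGS = [Encoding::UTF_8, Encoding::US_ASCII].freeze

# Picks a scanning strategy from the path's encoding alone.
def scan_strategy(path)
  BYTE_SCAN_SAFE_ENCODINGS.include?(path.encoding) ? :byte_scan : :encoding_aware
end

# True when no multibyte character in `str` contains a byte below 0x80,
# i.e. when an ASCII byte can never be part of a longer character.
def multibyte_bytes_all_high?(str)
  str.each_char
     .reject(&:ascii_only?)
     .all? { |ch| ch.bytes.all? { |b| b >= 0x80 } }
end

if __FILE__ == $PROGRAM_NAME
  utf8 = "/tmp/\u30BD/file"
  sjis = "/tmp/\x83\x5C/file".b.force_encoding(Encoding::Shift_JIS)
  [utf8, sjis].each do |path|
    puts "#{path.encoding}: #{scan_strategy(path)}, " \
         "high bytes only: #{multibyte_bytes_all_high?(path)}"
  end
end
```

The property checked by `multibyte_bytes_all_high?` holds for any valid UTF-8 string, but not for Shift_JIS, where a trailing byte can fall in the ASCII range.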

byroot · 2026-01-19T08:15:46Z

Closing in favor of #15907


byroot force-pushed the speedup-file-dirname branch from 6b6b35f to 5d8c3ac · January 18, 2026 19:56

byroot force-pushed the speedup-file-dirname branch from 5d8c3ac to 98d9072 · January 18, 2026 20:25

byroot mentioned this pull request Jan 19, 2026

Optimize File.dirname for common encodings #15907

Merged

byroot closed this Jan 19, 2026

byroot mentioned this pull request Jan 20, 2026

File.dirname: add a spec for Shift JIS handling ruby/spec#1330

Merged

byroot wants to merge 1 commit into ruby:master from byroot:speedup-file-dirname