Skip to content

Optimize IPA computation with caching and batch train type queries#1434

Merged
TinyKitten merged 5 commits into
devfrom
claude/fix-ipa-performance-03fEK
Mar 20, 2026
Merged

Optimize IPA computation with caching and batch train type queries#1434
TinyKitten merged 5 commits into
devfrom
claude/fix-ipa-performance-03fEK

Conversation

@TinyKitten

@TinyKitten TinyKitten commented Mar 20, 2026

Copy link
Copy Markdown
Member

Summary

This PR improves performance by introducing IPA computation caching and optimizing train type lookups through batch queries instead of individual database calls.

Key Changes

  • IPA Computation Caching: Added compute_ipa_cached() and compute_line_ipa_cached() functions with thread-safe memoization using LazyLock and RwLock. These functions consolidate redundant IPA computations (katakana-to-IPA, roman-to-IPA, and TTS segments) into a single pass, eliminating double-computation of station_name_to_tts_segments().

  • Batch Train Type Queries: Introduced find_by_line_group_id_and_line_id_vec() method to the TrainTypeRepository trait that accepts multiple (line_group_id, line_id) pairs and returns results in a HashMap. This replaces individual database queries in the query interactor with a single batch query.

  • Query Optimization: Modified the station query interactor to collect all (line_group_cd, line_cd) pairs upfront and perform a single batch lookup instead of querying the database for each line-train type combination.

  • DTO Simplification: Updated Station, Line, and TrainType DTO conversions to use the new cached IPA functions, reducing code duplication and improving maintainability.

  • Database Query Enhancement: Implemented get_by_line_group_id_and_line_id_vec() in the PostgreSQL repository using UNNEST to efficiently join multiple pairs in a single query with DISTINCT ON to handle duplicates.

Notable Implementation Details

  • The IPA cache uses separate caches for stations/train-types and lines to handle their different computation logic (line names have suffix replacement).
  • The batch train type query uses PostgreSQL's UNNEST function to unnest two parallel arrays of IDs and join them efficiently.
  • Cache keys are tuples of (katakana_name, optional_roman_name) to ensure proper memoization across different name combinations.
  • Test mocks were updated to implement the new repository method.

https://claude.ai/code/session_01C529f2ARzy4nrGEak9kgE4

Summary by CodeRabbit

リリースノート

  • パフォーマンス改善

    • 駅名・路線名・列車種別の発音(IPA)と読み上げセグメントをキャッシュして応答を高速化しました。
    • 複数の列車種別情報をまとめて取得する一括取得処理を追加し、検索応答速度を向上しました。
  • Refactor

    • 発音・TTS関連の計算を単一の統一処理にまとめ、保守性を向上しました。
  • Tests

    • 一括取得対応のモック/テストを更新しました。

claude added 2 commits March 20, 2026 08:23
- IPA変換の二重計算を排除する統合関数(compute_ipa/compute_line_ipa)を追加
- LazyLock+RwLock<HashMap>によるプロセスグローバルなメモ化キャッシュを導入
  同じ(name_katakana, name_roman)に対するIPA計算を1回に削減
- Station/Line/TrainType各DTOの3つの個別IPA呼び出しをキャッシュ付き統合関数に置換
- query.rs内のルート検索でのインラインIPA呼び出しも同様に置換
- train_type取得のN+1クエリをUNNESTベースのバッチクエリに置換

https://claude.ai/code/session_01C529f2ARzy4nrGEak9kgE4
@TinyKitten TinyKitten self-assigned this Mar 20, 2026
@coderabbitai

coderabbitai Bot commented Mar 20, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 54607ee6-f1f9-4f5e-83be-7569899886eb

📥 Commits

Reviewing files that changed from the base of the PR and between 74536f6 and b8a1401.

📒 Files selected for processing (1)
  • stationapi/src/use_case/interactor/query.rs
✅ Files skipped from review due to trivial changes (1)
  • stationapi/src/use_case/interactor/query.rs

📝 Walkthrough

Walkthrough

IpaResultによるIPA/TTS出力の単一化と駅/路線向けグローバルキャッシュ導入、および列車種別リポジトリに複数(line_group_id, line_id)ペアを一括取得するバッチAPI追加。DTOやインタラクターはこれらの新APIへ切替済み。

変更内容

コホート / ファイル(s) 概要
IPA計算キャッシング
stationapi/src/domain/ipa.rs
IpaResult構造体(name_ipaname_roman_ipatts_segments)を追加。STATION_IPA_CACHE/LINE_IPA_CACHE、内部計算(compute_ipa/compute_line_ipa)、共有キャッシュロジックcached_lookup、公開API compute_ipa_cached / compute_line_ipa_cached を実装。
リポジトリバッチ取得
stationapi/src/domain/repository/train_type_repository.rs, stationapi/src/infrastructure/train_type_repository.rs
TrainTypeRepositoryトレイトに find_by_line_group_id_and_line_id_vec を追加。インフラ実装で空入力短絡、単一SQL(UNNEST)で複数ペア取得し HashMap<(u32,u32), TrainType> を返す実装を導入。
DTO変換の統一
stationapi/src/use_case/dto/station.rs, stationapi/src/use_case/dto/train_type.rs, stationapi/src/use_case/dto/line.rs
既存の個別IPA/TTS呼び出しを廃止し、compute_ipa_cached / compute_line_ipa_cached の単一呼び出しで name_ipaname_roman_ipatts_segments を取得するように変更。不要インポートを削除。
インタラクター最適化
stationapi/src/use_case/interactor/query.rs
get_train_types_by_station_id を個別クエリから find_by_line_group_id_and_line_id_vec を使うバッチ取得へ変更。get_routes_minimal の駅IPA計算を compute_ipa_cached に統合。テストモックをバッチAPI対応へ更新。

シーケンス図

sequenceDiagram
    participant Client as クライアント
    participant Cache as IPAキャッシュ
    participant Compute as IPA計算
    participant DTO as DTO変換

    Client->>Cache: compute_ipa_cached(katakana, roman)
    Cache->>Cache: 読み取りロックで検索
    alt キャッシュヒット
        Cache-->>Client: IpaResult を返却
    else キャッシュミス
        Cache->>Compute: compute_ipa / compute_line_ipa を呼出し
        Compute->>Compute: name_ipa, name_roman_ipa, tts_segments を生成
        Compute-->>Cache: 計算結果を返却
        Cache->>Cache: 書き込みロックで挿入
        Cache-->>Client: IpaResult を返却
    end
    Client->>DTO: IpaResult からフィールドを抽出してマッピング
    DTO-->>Client: Grpc メッセージ返却
Loading

推定コードレビュー工数

🎯 4 (Complex) | ⏱️ ~60 minutes

関連PRの可能性

推奨ラベル

fix

ポエム

🐰 キャッシュの小道を跳ねて行く
同じ響きはもう二度と数えないよ
ふわり計算して、ぴょんと渡す
みんなで速さを祝う小さな跳躍
兎は今日もリズムを刻む

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 26.92% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and concisely summarizes the main changes: adding IPA computation caching and batch train type queries, which are the core improvements described in the objectives.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/fix-ipa-performance-03fEK
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
stationapi/src/use_case/interactor/query.rs (2)

620-635: tt_lookup_pairs は重複排除してから渡したいです。

同じ line_group_cd に複数の train_types があると、同一の (line_group_cd, line_cd) がその数だけ配列へ積まれます。戻り値は HashMap なので重複に意味はなく、UNNEST の入力だけが膨らんでしまいます。今回の最適化効果を落とさないためにも、ここで一度集合化してからクエリへ渡すのがよさそうです。

♻️ 例
-        let tt_lookup_pairs: Vec<(u32, u32)> = train_types
+        let tt_lookup_pairs: Vec<(u32, u32)> = train_types
             .iter()
             .filter_map(|tt| tt.line_group_cd.map(|lgc| lgc as u32))
             .flat_map(|lgc| {
                 lines
                     .iter()
                     .filter(move |l| l.line_group_cd == Some(lgc as i32))
                     .map(move |l| (lgc, l.line_cd as u32))
             })
+            .collect::<HashSet<_>>()
+            .into_iter()
             .collect();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@stationapi/src/use_case/interactor/query.rs` around lines 620 - 635,
tt_lookup_pairs currently collects duplicate (line_group_cd, line_cd) tuples
which unnecessarily expands the query input; change the building of
tt_lookup_pairs (based on train_types and lines) to deduplicate pairs before
calling self.train_type_repository.find_by_line_group_id_and_line_id_vec, e.g.
collect into a HashSet or use iterator .unique() equivalent and then convert
back to Vec so tt_lookup_pairs contains unique (u32,u32) pairs only; the rest of
the flow (tt_by_pair) can remain unchanged.

1945-1950: このモックだと batch 経路の検証が空振りになります。

get_train_types_by_station_id() の新経路は find_by_line_group_id_and_line_id_vec() に依存しますが、この実装は self.train_types を見ずに常に空 HashMap を返しています。今後この経路のテストを追加しても line.train_type が必ず None になって、本番相当の検証ができません。少なくともテストデータから (line_group_id, line_id) -> TrainType を返せる形にしておきたいです。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@stationapi/src/use_case/interactor/query.rs` around lines 1945 - 1950, The
mock implementation of find_by_line_group_id_and_line_id_vec currently always
returns an empty HashMap, causing batch-route tests to miss train_type data;
update the method (find_by_line_group_id_and_line_id_vec) to iterate the
provided key slice and build a HashMap by looking up matching entries in
self.train_types (or the mock's test-data store), mapping each (line_group_id,
line_id) to its TrainType when present, and return that map wrapped in Ok(...)
while preserving the Result<..., DomainError> signature.
stationapi/src/infrastructure/train_type_repository.rs (1)

420-504: この batch SQL には integration test を 1 本足しておきたいです。

UNNEST($1, $2) の並列ペアリングと DISTINCT ON の重複解消は、SQL を少し触っただけでも壊れやすいポイントです。重複 pair と miss を含むケースを固定しておくと、この最適化経路の回帰を拾いやすくなります。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@stationapi/src/infrastructure/train_type_repository.rs` around lines 420 -
504, Add an integration test for get_by_line_group_id_and_line_id_vec that
verifies UNNEST($1::int[], $2::int[]) pairs and DISTINCT ON deduplication: seed
the DB with types, stations and station_station_types rows so one pair has
multiple sst rows (different sst.id) and another pair has no matching stations;
call get_by_line_group_id_and_line_id_vec with a pairs array containing
duplicate pairs and a non-existent pair, then assert the returned HashMap
contains exactly one TrainType for the duplicated pair (matching the sst row
expected by ORDER BY sst.id) and no entry for the missing pair; ensure the test
uses the same DB connection setup/transaction helpers your integration tests use
so it can run isolated and roll back.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@stationapi/src/infrastructure/train_type_repository.rs`:
- Around line 420-504: Add an integration test for
get_by_line_group_id_and_line_id_vec that verifies UNNEST($1::int[], $2::int[])
pairs and DISTINCT ON deduplication: seed the DB with types, stations and
station_station_types rows so one pair has multiple sst rows (different sst.id)
and another pair has no matching stations; call
get_by_line_group_id_and_line_id_vec with a pairs array containing duplicate
pairs and a non-existent pair, then assert the returned HashMap contains exactly
one TrainType for the duplicated pair (matching the sst row expected by ORDER BY
sst.id) and no entry for the missing pair; ensure the test uses the same DB
connection setup/transaction helpers your integration tests use so it can run
isolated and roll back.

In `@stationapi/src/use_case/interactor/query.rs`:
- Around line 620-635: tt_lookup_pairs currently collects duplicate
(line_group_cd, line_cd) tuples which unnecessarily expands the query input;
change the building of tt_lookup_pairs (based on train_types and lines) to
deduplicate pairs before calling
self.train_type_repository.find_by_line_group_id_and_line_id_vec, e.g. collect
into a HashSet or use iterator .unique() equivalent and then convert back to Vec
so tt_lookup_pairs contains unique (u32,u32) pairs only; the rest of the flow
(tt_by_pair) can remain unchanged.
- Around line 1945-1950: The mock implementation of
find_by_line_group_id_and_line_id_vec currently always returns an empty HashMap,
causing batch-route tests to miss train_type data; update the method
(find_by_line_group_id_and_line_id_vec) to iterate the provided key slice and
build a HashMap by looking up matching entries in self.train_types (or the
mock's test-data store), mapping each (line_group_id, line_id) to its TrainType
when present, and return that map wrapped in Ok(...) while preserving the
Result<..., DomainError> signature.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f997eaee-faca-4d24-8063-a0b61c59f465

📥 Commits

Reviewing files that changed from the base of the PR and between 5229d7c and 73f89d6.

📒 Files selected for processing (7)
  • stationapi/src/domain/ipa.rs
  • stationapi/src/domain/repository/train_type_repository.rs
  • stationapi/src/infrastructure/train_type_repository.rs
  • stationapi/src/use_case/dto/line.rs
  • stationapi/src/use_case/dto/station.rs
  • stationapi/src/use_case/dto/train_type.rs
  • stationapi/src/use_case/interactor/query.rs

…ne_id_vec

- Collect tt_lookup_pairs into HashSet before converting to Vec to avoid
  sending duplicate (line_group_cd, line_cd) pairs to the DB query
- Update ConfigurableMockTrainTypeRepository's find_by_line_group_id_and_line_id_vec
  to actually look up matching TrainType entries by line_group_cd instead of
  always returning an empty HashMap

https://claude.ai/code/session_01C529f2ARzy4nrGEak9kgE4

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
stationapi/src/use_case/interactor/query.rs (1)

1947-1962: モック内で単体APIとバッチAPIの意味論を揃えてください。

この実装だと find_by_line_group_id_and_line_id は常に None なのに、find_by_line_group_id_and_line_id_vec は値を返すため、同一リポジトリ内で挙動が不一致です。テストが実装差分を見逃す原因になります。共通ヘルパーに寄せて、両メソッドの返却規則を一致させるのを推奨します。

差分例(挙動統一の方向性)
 impl TrainTypeRepository for ConfigurableMockTrainTypeRepository {
     async fn find_by_line_group_id_and_line_id(
         &self,
-        _: u32,
-        _: u32,
+        line_group_id: u32,
+        _line_id: u32,
     ) -> Result<Option<TrainType>, DomainError> {
-        Ok(None)
+        Ok(self
+            .train_types
+            .iter()
+            .find(|t| t.line_group_cd == Some(line_group_id as i32))
+            .cloned())
     }

     async fn find_by_line_group_id_and_line_id_vec(
         &self,
         pairs: &[(u32, u32)],
     ) -> Result<std::collections::HashMap<(u32, u32), TrainType>, DomainError> {
         let mut map = std::collections::HashMap::new();
         for &(lg, lc) in pairs {
-            if let Some(tt) = self
-                .train_types
-                .iter()
-                .find(|t| t.line_group_cd == Some(lg as i32))
-            {
+            if let Some(tt) = self.find_by_line_group_id_and_line_id(lg, lc).await? {
                 map.insert((lg, lc), tt.clone());
             }
         }
         Ok(map)
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@stationapi/src/use_case/interactor/query.rs` around lines 1947 - 1962,
現在のモックは単体メソッド find_by_line_group_id_and_line_id が常に None になる一方で、バッチ
find_by_line_group_id_and_line_id_vec
は値を返しており振る舞いが不一致です。両者を同じ返却規則に揃えるため、マッチ判定ロジックを共通ヘルパー(例:
matches_line(line_group_id, line_id,
&TrainType))に抽出し、find_by_line_group_id_and_line_id はそのヘルパーを使って Option
を返すように、find_by_line_group_id_and_line_id_vec も同じヘルパーで一致するペアのみを HashMap
に挿入するよう修正してください(現在の実装が line_id を無視しているので必ず両方をチェックすること)。
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@stationapi/src/use_case/interactor/query.rs`:
- Around line 1947-1962: 現在のモックは単体メソッド find_by_line_group_id_and_line_id が常に
None になる一方で、バッチ find_by_line_group_id_and_line_id_vec
は値を返しており振る舞いが不一致です。両者を同じ返却規則に揃えるため、マッチ判定ロジックを共通ヘルパー(例:
matches_line(line_group_id, line_id,
&TrainType))に抽出し、find_by_line_group_id_and_line_id はそのヘルパーを使って Option
を返すように、find_by_line_group_id_and_line_id_vec も同じヘルパーで一致するペアのみを HashMap
に挿入するよう修正してください(現在の実装が line_id を無視しているので必ず両方をチェックすること)。

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cab46d7d-ab16-4121-884e-0e23c023430a

📥 Commits

Reviewing files that changed from the base of the PR and between 73f89d6 and 74536f6.

📒 Files selected for processing (1)
  • stationapi/src/use_case/interactor/query.rs

claude and others added 2 commits March 20, 2026 09:13
Extract matches_pair(tt, line_group_id, line_id) that checks both
line_group_cd and line_id (via tt.lines when populated). Both
find_by_line_group_id_and_line_id and find_by_line_group_id_and_line_id_vec
now use this helper, fixing the inconsistency where the single-pair method
always returned None while the batch method returned data.

https://claude.ai/code/session_01C529f2ARzy4nrGEak9kgE4
@TinyKitten TinyKitten merged commit 0e5c23e into dev Mar 20, 2026
11 checks passed
@TinyKitten TinyKitten deleted the claude/fix-ipa-performance-03fEK branch March 20, 2026 10:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants