Add segmented TTS metadata for mixed-language names by TinyKitten · Pull Request #1432 · TrainLCD/StationAPI

TinyKitten · 2026-03-15T15:38:54Z

Summary

add segmented TTS metadata for Station, StationMinimal, Line, and TrainType
preserve existing IPA fields while introducing per-token TTS segments for mixed-language names
update the gRPCProto submodule and IPA handling to support mixed-language TTS construction

Verification

cargo fmt --check
SQLX_OFFLINE=true cargo build -p stationapi
SQLX_OFFLINE=true cargo test -p stationapi

Summary by CodeRabbit

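Documentation
- Added a description of how TTS metadata is exposed
New Features
- Station, StationMinimal, Line, and TrainType now expose name_tts_segments, providing per-segment pronunciation data (surface form, fallback, pronunciation, language, and separator)
Tests
- Added tests that verify TTS segment splitting for mixed-language station names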

coderabbitai · 2026-03-15T15:39:11Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ae0df02c-81f1-4ce9-bc65-078248de3ea0

📥 Commits

Reviewing files that changed from the base of the PR and between 6af58fc and 8fc46e0.

📒 Files selected for processing (3)

stationapi/src/domain/ipa.rs
stationapi/src/use_case/dto/line.rs
stationapi/src/use_case/dto/station.rs

📝 Walkthrough

Walkthrough

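To expose text-to-speech (TTS) metadata on the Station, StationMinimal, Line, and TrainType endpoints, this PR adds a name segmentation API, adds serde derives to the corresponding protocol buffer types, and integrates name_tts_segments into the DTOs and interactors.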

Changes

Cohort / File(s)	Summary
Documentation update `AGENTS.md`	Added a note on TTS metadata exposure (name_tts_segments) to the gRPC endpoint overview.
Build configuration `stationapi/build.rs`	Added `Serialize/Deserialize` derive attributes to the generated types `TtsAlphabet` and `TtsSegment`.
Proto submodule `stationapi/proto`	Updated the submodule commit pointer.
IPA/TTS core processing `stationapi/src/domain/ipa.rs`	Refactored the station name → IPA pipeline to be segment-based. Added the `TtsAlphabetKind`/`TtsNameSegment` types and `station_name_to_tts_segments`, implementing per-segment pronunciation data generation.
DTO module export `stationapi/src/use_case/dto.rs`	Exposed the new `pub mod tts;`.
DTO conversion `stationapi/src/use_case/dto/line.rs`, `stationapi/src/use_case/dto/station.rs`, `stationapi/src/use_case/dto/train_type.rs`	Added `name_tts_segments` to each entity's gRPC conversion, populated via `station_name_to_tts_segments` and `to_proto_tts_segments`.
TTS conversion utility `stationapi/src/use_case/dto/tts.rs`	Added `to_proto_tts_segments`, which converts `Vec<TtsNameSegment>` into `Vec<TtsSegment>`.
Query/interactor `stationapi/src/use_case/interactor/query.rs`	`get_routes` and `get_routes_minimal` now compute TTS segments and set `name_tts_segments` on `StationMinimal`.
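
The conversion row for `stationapi/src/use_case/dto/tts.rs` can be pictured with a minimal sketch. The type and function names come from the table above, but everything else is an assumption: the field names are inferred from the segment attributes listed in the summary and the test assertions, the enum variants and their numeric values are hypothetical, and the proto-side struct is a hand-written stand-in for the generated type (assumed here to store the enum as an `i32`).

```rust
// Hypothetical stand-in for the domain-side alphabet enum; variants are assumed.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TtsAlphabetKind {
    Unspecified,
    Ipa,
}

// Hypothetical domain-side segment; field names are inferred, not confirmed.
#[derive(Debug, Clone, PartialEq)]
struct TtsNameSegment {
    surface: String,
    fallback_text: String,
    pronunciation: String,
    alphabet: TtsAlphabetKind,
    language: String,
    separator: String,
}

// Hand-written stand-in for the generated proto message.
// Assumption: the enum field is carried as an i32.
#[derive(Debug, Clone, PartialEq)]
struct TtsSegment {
    surface: String,
    fallback_text: String,
    pronunciation: String,
    alphabet: i32,
    language: String,
    separator: String,
}

// Maps each domain segment to its proto counterpart, field by field.
fn to_proto_tts_segments(segments: Vec<TtsNameSegment>) -> Vec<TtsSegment> {
    segments
        .into_iter()
        .map(|segment| TtsSegment {
            surface: segment.surface,
            fallback_text: segment.fallback_text,
            pronunciation: segment.pronunciation,
            // Assumed numbering: 0 = unspecified, 1 = IPA.
            alphabet: match segment.alphabet {
                TtsAlphabetKind::Unspecified => 0,
                TtsAlphabetKind::Ipa => 1,
            },
            language: segment.language,
            separator: segment.separator,
        })
        .collect()
}
```

Keeping the conversion in one helper means the Station, Line, and TrainType conversions can share it rather than each mapping fields on their own.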

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Interactor as UseCase/Interactor
  participant Domain as domain::ipa
  participant DTO as use_case::dto
  participant Proto as generated::proto

  Client->>Interactor: get_routes / get_routes_minimal
  Interactor->>Domain: station_name_to_tts_segments(name_k, name_r)
  Domain-->>Interactor: Vec<TtsNameSegment>
  Interactor->>DTO: to_proto_tts_segments(segments)
  DTO-->>Interactor: Vec<TtsSegment>
  Interactor->>Proto: build Station/Line/TrainType with name_tts_segments
  Interactor-->>Client: gRPC response (includes name_tts_segments)
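
On the receiving side of this flow, a client could assemble speech markup from the returned segments. The following is only a sketch of one possible consumer, not something this PR contains: the `Segment` struct is a hypothetical client-side view that keeps just the fields the sketch needs, the function names are made up, and the choice to emit SSML `<phoneme>` elements with `alphabet="ipa"` is an assumption based on the related PR about SSML phoneme tags.

```rust
// Hypothetical client-side view of one entry in `name_tts_segments`.
struct Segment {
    fallback_text: String,
    pronunciation: String,
    separator: String,
}

// Escapes the characters that are unsafe in XML text and attribute values.
fn escape_xml(text: &str) -> String {
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
}

// Builds an SSML string: segments with a pronunciation become <phoneme>
// elements, segments without one are emitted as plain fallback text.
fn segments_to_ssml(segments: &[Segment]) -> String {
    let mut ssml = String::from("<speak>");
    for segment in segments {
        if segment.pronunciation.is_empty() {
            ssml.push_str(&escape_xml(&segment.fallback_text));
        } else {
            ssml.push_str(&format!(
                "<phoneme alphabet=\"ipa\" ph=\"{}\">{}</phoneme>",
                escape_xml(&segment.pronunciation),
                escape_xml(&segment.fallback_text),
            ));
        }
        // The separator is emitted after every segment, including the last.
        ssml.push_str(&escape_xml(&segment.separator));
    }
    ssml.push_str("</speak>");
    ssml
}
```

Because each segment carries its own pronunciation and separator, a mixed-language name can be voiced piece by piece instead of forcing one pronunciation string over the whole name.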

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

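fix: Remove common suffixes (線/本線/支線) from line-name IPA #1412: Closely related; it changes the same set of files (src/domain/ipa.rs and the DTO conversions) and introduces name preprocessing such as suffix removal.
Fix IPA handling for Meitetsu station names #1431: Changes the IPA/tokenization logic in stationapi/src/domain/ipa.rs, so conflicts with the segmentation logic, or an opportunity to share code with it, are likely.
Add katakana to IPA conversion for SSML phoneme tags #1410: Changes the existing katakana→IPA path and adds the name_ipa field, so its scope overlaps with the new segment API.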

Suggested labels

feature

Poem

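🐰 Segment by segment, we string the words together,
English and Japanese both riding the rhythm,
Station names turn into little songs,
With a hop and a bound, the voice gets through 🎶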

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 22.73% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title clearly reflects the main purpose of the change and accurately summarizes its central change: adding segmented TTS metadata for mixed-language names.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch fix/ipa-meitetsu-stations

📝 Coding Plan

Generate coding plan for human review comments

_{Comment @coderabbitai help to get the list of available commands and usage tips.}


coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

stationapi/src/use_case/dto/station.rs (1)
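189-191: The pronunciation field of segment [1] is not verified.

The Rinkai segment has its surface, fallback_text, and separator verified, but not its pronunciation. Verifying this field as well, as is done for the other segments, gives more complete coverage of the TTS segment generation logic.
💡 Suggested pronunciation assertion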
         assert_eq!(grpc_station.name_tts_segments[1].surface, "Rinkai");
         assert_eq!(grpc_station.name_tts_segments[1].fallback_text, "りんかい");
+        assert_eq!(grpc_station.name_tts_segments[1].pronunciation, "ɾiŋka.i"); // confirm the expected pronunciation
         assert_eq!(grpc_station.name_tts_segments[1].separator, " ");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@stationapi/src/use_case/dto/station.rs` around lines 189 - 191, the test's
verification of grpc_station's TTS segments is incomplete, so add an assertion
that also checks the pronunciation field of name_tts_segments[1] (target:
grpc_station.name_tts_segments[1].pronunciation); set the expected value to match
the existing test data and the pronunciation format used for the other segments,
and compare it as strictly as the Rinkai segment's surface/fallback_text/separator.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@stationapi/src/domain/ipa.rs`:
- Around line 255-263: The code computes stem_len from normalized byte lengths
and slices original by that byte index, which can split UTF-8 chars; instead
compute the stem boundaries in characters and map to a byte index in original:
determine stem_char_count = normalized.chars().count() - suffix.chars().count()
(or find the char-index where suffix begins), then get the byte offset in
original via original.char_indices().nth(stem_char_count).map(|(i,_)|
i).unwrap_or(original.len()) and use that byte offset for
&original[..byte_offset]; then pass that safe UTF-8 stem to
word_to_tts_segments(stem) and leave suffix handling unchanged (references:
JAPANESE_SUFFIXES, normalized, original, stem_len, word_to_tts_segments).

In `@stationapi/src/use_case/dto/line.rs`:
- Around line 20-23: The fallback for name_tts_segments is disabled, so pass the
katakana fallback value to station_name_to_tts_segments instead of an empty
string: in the to_proto_tts_segments(station_name_to_tts_segments(...)) call,
replace the first argument "" with line.line_name_k and keep passing
line.line_name_r as the second argument, which enables the katakana fallback
(references: name_tts_segments, station_name_to_tts_segments, line.line_name_k,
line.line_name_r).

---

Nitpick comments:
In `@stationapi/src/use_case/dto/station.rs`:
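- Around line 189-191: The test's verification of grpc_station's TTS segments is
incomplete, so add an assertion that also checks the pronunciation field of
name_tts_segments[1] (target: grpc_station.name_tts_segments[1].pronunciation);
set the expected value to match the existing test data and the pronunciation
format used for the other segments, and compare it as strictly as the Rinkai
segment's surface/fallback_text/separator.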
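
The inline comment on `stationapi/src/domain/ipa.rs` describes mapping a character count to a byte offset before slicing. A minimal sketch of that technique follows. The function name `split_stem` and the sample names are hypothetical, and the sketch assumes normalization maps each character of `original` to exactly one character of `normalized`, so the character counts line up; the actual code around `JAPANESE_SUFFIXES` may differ.

```rust
// Returns the stem of `original` that precedes `suffix`, where `suffix` was
// matched against `normalized`. Counts characters rather than bytes, so the
// slice always falls on a UTF-8 character boundary.
// Assumption: `original` and `normalized` have the same number of characters.
fn split_stem<'a>(original: &'a str, normalized: &str, suffix: &str) -> &'a str {
    // Number of characters in the stem.
    let stem_char_count = normalized
        .chars()
        .count()
        .saturating_sub(suffix.chars().count());

    // Byte offset in `original` of the first character after the stem;
    // falls back to the full length if the stem covers the whole string.
    let byte_offset = original
        .char_indices()
        .nth(stem_char_count)
        .map(|(index, _)| index)
        .unwrap_or(original.len());

    &original[..byte_offset]
}
```

For a hypothetical name such as `Ōmae` with normalized form `omae` and suffix `mae`, a byte-based stem length of 1 would land inside the two-byte `Ō` and panic on slicing, whereas the character-based offset yields `Ō`.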

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a1b60049-1777-46bf-a8db-e389e6c33b4b

📥 Commits

Reviewing files that changed from the base of the PR and between f80be56 and 6af58fc.

📒 Files selected for processing (10)

AGENTS.md
stationapi/build.rs
stationapi/proto
stationapi/src/domain/ipa.rs
stationapi/src/use_case/dto.rs
stationapi/src/use_case/dto/line.rs
stationapi/src/use_case/dto/station.rs
stationapi/src/use_case/dto/train_type.rs
stationapi/src/use_case/dto/tts.rs
stationapi/src/use_case/interactor/query.rs

TinyKitten added 5 commits March 15, 2026 12:41

Fix IPA handling for Meitetsu station names

39c32a8

Preserve katakana fallback for invalid romanized IPA

a2d899f

Add segmented TTS metadata for mixed-language names

fa016f3

Update gRPCProto submodule for TtsAlphabet enum naming

c91257f

Update gRPCProto submodule

170e7ad

github-actions Bot added the fix 直した label Mar 15, 2026

Merge remote-tracking branch 'origin/dev' into fix/ipa-meitetsu-stations

6af58fc

# Conflicts: # stationapi/src/domain/ipa.rs # stationapi/src/use_case/dto/line.rs # stationapi/src/use_case/dto/station.rs # stationapi/src/use_case/dto/train_type.rs

TinyKitten self-assigned this Mar 15, 2026

coderabbitai Bot requested changes Mar 15, 2026

View reviewed changes

Comment thread stationapi/src/domain/ipa.rs

Comment thread stationapi/src/use_case/dto/line.rs

Fix TTS segment fallback and UTF-8 slicing

8fc46e0

coderabbitai Bot approved these changes Mar 15, 2026

View reviewed changes

TinyKitten merged commit a1c4253 into dev Mar 15, 2026
11 checks passed

TinyKitten deleted the fix/ipa-meitetsu-stations branch March 15, 2026 16:08

coderabbitai Bot mentioned this pull request Mar 20, 2026

Optimize IPA computation with caching and batch train type queries #1434

Merged

TinyKitten merged 7 commits into dev from fix/ipa-meitetsu-stations
