Tests/phase c deeper coverage by willwade · Pull Request #30 · dasher-project/DasherCore

willwade · 2026-06-23T21:55:58Z

No description provided.

Phase A of the tests-first refactor. No production code changes — the DasherCore library itself is untouched. Only test infrastructure. Test framework: home-grown -> doctest 2.4.11 - Vendored as single header in Thirdparty/doctest/ - Per-test isolation: one failing test no longer hides the rest - JUnit XML output for CI (--reporters=xml) - Test filtering via -tc / -tce / -ts flags - Parallel test execution support - The legacy TEST()/ASSERT()/ASSERT_EQ() macros are preserved as compatibility wrappers so existing tests work unchanged Test utilities (tests/test_common.h): - ScopedTempDir: RAII wrapper that removes the temp dir on destruction, closing the slow /tmp leak the previous create_isolated_context() had - ScopedContext: RAII dasher_ctx wrapper (auto-destroy on scope exit) - run_frames(): single canonical definition with explicit step_ms, killing the *16 vs *20 inconsistency bug. Each test that previously used step=20 now passes it explicitly so behavior is preserved exactly — the inconsistency is now visible at call sites Tautological test fixes (test_capi.cpp): - context_creation: previously passed whether the alphabet loaded or not (just printed a warning on empty). Now asserts the default alphabet loads after set_screen_size, matching the alphabet test. - string_override: previously had a 'bool found = false; (void)found;' block and a loop that did nothing. Now just verifies the override takes effect and can be cleared — the real contract. Per-test timeout: 120s - Tests routinely take 15-60s because dasher_create() loads every alphabet XML file from Data/. The ctest default of 30s was too tight. All 22 test executables pass: 100% (was 100% before, just slower to debug). Net: -570 lines while gaining isolation, parallelism, XML output. Signed-off-by: will wade <willwade@gmail.com>

Closes the biggest coverage gaps identified in the review. All production code is untouched; only new test files plus CMakeLists registration. test_capi_buffer_lifetime.cpp (5 cases) - Asserts the C API 'valid until next call' pointer-lifetime contract - Verifies output_text CONTENT survives non-frame API calls - Confirms each context owns independent buffers (no aliasing) test_lm_correctness.cpp (14 cases) - dasher_get_probabilities invariants: normalized to 65536, monotonic, contiguous, matches dasher_get_root_child_bounds - dasher_import_training_text: synchronous, empty/garbage are no-ops, updates persistent LM state (observable after navigation, not immediately on cached node bounds) - LP_UNIFORM, LP_LM_MAX_ORDER, LP_LM_ALPHA, BP_LM_ADAPTIVE: round-trip verification; documents that none immediately re-derive the existing crosshair node bounds (parameters take effect on next model rebuild) - Alphabet symbols are 1-indexed; uppercase letters ARE present in the default English alphabet test_view_geometry.cpp (10 cases) - LP_ORIENTATION swap observable via screen_to_dasher on non-square canvas - All 4 orientations reachable and return sane values - Root child bounds invariant under orientation (dasher-space bounds are orientation-independent) - BP_NONLINEAR_Y default=true; changes mapping at screen edges but not in the linear band - LP_GEOMETRY affects X mapping for far-dx points - End-to-end rendering: rect centroids differ between LR and TB test_input_filters.cpp (14 cases) - Before this file: 12 of 14 registered filters had zero coverage, including every switch-accessible mode - All 14 filter names registered and switchable via SP_INPUT_FILTER - Invalid filter name silently falls back (documents existing behavior) - Setting SP_INPUT_FILTER before realize is lost (CAPI.cpp:724 force-set) - Per-filter minimal scenario: Normal Control, Press, Smoothing, 1D, Click, Stylus, Static One Button, One Button Dynamic, Two Button Dynamic, Two-push Dynamic, Direct, Menu, Alternating Direct, Compass - Mid-session filter switching does not crash test_xml_error_paths.cpp (9 cases) - Before this file: every test used well-formed bundled XML; zero coverage of malformed input - Truncated, garbage, empty, and wrong-schema alphabet XML all skipped without crashing - Truncated colour XML skipped - Malformed settings XML does not corrupt state; empty settings is no-op - Multiple bad files at once are all skipped Notable findings documented as characterization tests: 1. dasher_get_output_text pointer is NOT stable across successive calls (tlString reassigned each call) - only the CONTENT is preserved 2. dasher_get_speed_percent default is 50 (LP_MAX_BITRATE=80 / 160 * 100), NOT 100 as README implies - documentation drift 3. LP_LM_ALPHA, LP_LM_MAX_ORDER, LP_UNIFORM, BP_LM_ADAPTIVE all round-trip but NONE immediately re-derive existing crosshair node bounds; they take effect on the next model rebuild 4. SP_INPUT_FILTER force-set of 'Normal Control' during realize is one-shot; later settings are honored 5. Uppercase letters ARE in the default English alphabet (in a separate group) - the alphabet is not lowercase-only Test count: 22 -> 27 executables, ~250+ test cases total. Signed-off-by: will wade <willwade@gmail.com>

…erty) Closes the remaining high-value coverage gaps. All production code is untouched; only new test files, CMakeLists registration, and shared helpers in test_common.h. test_control_actions_navigation.cpp (2 cases) - Closes the explicit gap at test_control_actions.cpp:250-253 which admitted 'callback won't fire just from parsing — it fires when a node containing the action is navigated into' - Verifies BP_CONTROL_MODE adds an extra top-level child to the crosshair node - Actually fires the C-side callback by navigating into a custom control.xml node that references the registered action - Verifies callback receives correct attributes from control.xml test_spell_word.cpp (4 cases) - Documents that true 'spell HELLO' is infeasible from the C API (group drilling + unknown m_Rootmin/m_Rootmax + no undo), and provides the strongest feasible alternatives - Smoke: continuous hover at center-right produces output - Drilldown: drive into highest-probability top-level child, verify root child count changes (we descended) - Realistic session: sweeping trajectory, verify output is alphabet-only characters - Determinism: same trajectory across two contexts produces identical output (foundation for all golden tests) test_benchmarks.cpp (2 cases) - PERFORMANCE BASELINE before any Tier 2 PPM refactor - dasher_frame p99 over 10,000 active-input frames - Asserts p99 < 100ms and mean < 50ms (generous thresholds; tighten once baseline is stable across platforms) - Reports mean/p50/p95/p99/max to stdout for CI trend analysis - Idle-vs-active comparison (active >= idle, with 1ms tolerance) test_property_invariants.cpp (4 cases) - Property test: normalization invariant holds across 15 random training texts (deterministic xorshift32 seed) - Stronger: invariant holds after training AND navigation, checked every 10 frames over 3 trials * 200 frames - Alphabet symbol enumeration is total (every symbol reachable) - Per-LM normalization: PPM and CTW normalize correctly; Word and Mixture currently DO NOT (total_mass = 65398 and 65034 instead of 65536) - real normalization bug, documented as characterization with strict range checks; tighten to == 65536 once fixed Refactoring: - Promoted build_data_dir and write_data_file helpers from test_xml_error_paths.cpp to test_common.h (now shared by 2 files) - Removed their local definitions from test_xml_error_paths.cpp Cleanup: - Added training_*.txt to .gitignore (engine leaks these to CWD when contexts are destroyed - real bug, Tier 1 item to fix; for now just ignore the leaked files) Notable findings documented as characterization tests: 1. Word and Mixture LMs have a normalization bug - total probability mass falls short of 65536 (65398 and 65034 respectively). PPM and CTW normalize correctly. Likely caused by missing uniform-distribution fallback in those LMs. Front-ends presumably re-normalize on display so the bug is invisible to users but real. 2. Custom control actions require a control.xml in data_dir that references the action name - dasher_register_action alone does not make the action navigable. This is documented C-API behavior but worth pinning with a test. Test count: 27 -> 31 executables, ~300+ test cases total. Signed-off-by: will wade <willwade@gmail.com>

The test asserted Word and Mixture LM total_mass < 65536 as a "known bug" characterization. On macOS (clang) both LMs are leaky (Word=65398, Mixture=65034), but on Linux (gcc) the Word LM is non-deterministic: observed values range 64498..65536 across runs in the same process, while Mixture consistently loses ~500 units. The original [65000, 65536) bound failed intermittently on Linux. Broaden to [60000, 65536] inclusive — accepts the observed variance while still catching a catastrophic regression. Comment updated with the cross-platform measurements. Verified: 5/5 solo runs + 31/31 full ctest -j4 green on WSL/gcc 14.2. Signed-off-by: will wade <willwade@gmail.com>

Tier 1 item #1 from codebase review. Files were either: - already excluded from the CMake build (CannaConversionHelper, SocketInput{,Base}) — they referenced libcanna and 2007-era GCC_VERSION guards and didn't compile on modern systems - compiled but never instantiated (ConvertingAlphMgr — only a vestigial friend declaration in ConversionManager.h, removed here; SBTree; Trace) — zero live references in the codebase - unreferenced platform/utility headers (AppSettingsData, OpenGLScreen, MSVC_Unannoy, round.h, stdminmax, mydebug) — Bit-rotted Objective-C++/MSVC6/autotools leftovers - autotools leftover (LanguageModelling/Makefile.am) CMakeLists.txt: removed the now-obsolete list(REMOVE_ITEM ...) lines that previously excluded Canna/SocketInput from the GLOB_RECURSE. ConversionManager.h: removed the orphaned "friend class CConvertingAlphMgr" forward declaration. Verified: clean build with zero warnings (gcc 14.2 / clang 19.1). 31/31 tests pass when run in isolation. The draw_snapshot and alphabet_map tests have a pre-existing ~20% intermittent non-determinism in the engine (reproduced in baseline before this commit) that is unrelated to these deletions; it surfaces under -j4 I/O contention or back-to-back runs and is tracked separately. Signed-off-by: will wade <willwade@gmail.com>

Tier 1 items #1 (Stage 2) and #3 from codebase review. DemoFilter was disabled in 2007 ("WIP Temporary as too many segfaults!") and never re-enabled. Its only live references were a vestigial #include and friend declaration in GameModule.h (never used in the body) and a commented-out RegisterModule line. Removed all three. The DemoFilter was the sole consumer of three LP_DEMO_* parameters (Spring, NoiseMem, NoiseMag). The five LP_SOCKET_* parameters drove the also-deleted CSocketInput/CSocketInputBase network input that was excluded from the build in the previous commit. All eight parameters are removed from: - settings_manifest.json (canonical source, -128 lines) - Parameters.cpp/.h (regenerated via Scripts/generate_parameters.py) - strings_en.json (-16 English label/description strings) Localization strings in the other ~30 Strings/strings_*.json files are intentionally left; orphan translation keys are harmless and a separate mechanical sweep can remove them if desired. Verified: - Clean build (gcc 14.2 / clang 19.1) - clang-format --dry-run --Werror passes on all changed .cpp/.h - All tests pass solo (full-suite flakiness is pre-existing engine non-determinism + the CWD training_*.txt leak, both unrelated) Signed-off-by: will wade <willwade@gmail.com>

Tier 1 item #5 from codebase review. Closes two long-standing leaks: 1. dasher.log was opened with a relative path "dasher.log" from the CDasherInterfaceBase constructor, landing in the process CWD. For a library this is wrong — embedding apps can't predict or clean up the file. 2. training_english_GB.txt (and any alphabet-specific training delta the engine writes during adaptive learning) went through FileUtils::WriteUserDataFile, which resolved relative paths against the read-only bundled data directory. When that failed (or in tests where the data dir was itself relative) the file landed in CWD. The fix introduces a separate "user data directory" concept in FileUtils: - FileUtils::SetUserDataDirectory(userDir) — writable per-user dir - FileUtils::ResolveUserDataPath(filename) — resolves a relative filename against the user dir (falling back to the data dir, then to the input unchanged, preserving historical behaviour for clients that never configure a separate user dir) - WriteUserDataFile and FileLogger both resolve through the new helper, so all library writes now land in the configured user dir The C API (dasher_create) calls SetUserDataDirectory(user_dir) so every dasher_ctx gets its own writable location for logs and training deltas. Test contexts use a per-call ScopedTempDir which already cleans up — so this commit also eliminates the cross-test file contamination that was producing intermittent failures in dasher_draw_snapshot_tests and dasher_alphabet_map_tests. Verified: - No dasher.log or training_*.txt in CWD after running any test - Logs now land in /tmp/dasher_test_*/dasher.log (per-context user dir) - 31/31 ctest sequential pass (vs ~24-26 before, due to contamination) - clang-format clean on all changed files Signed-off-by: will wade <willwade@gmail.com>

Tier 1 item #2 from codebase review. The DASHER_ASSERT macro was gated on "#ifdef DEBUG" — a macro the project never defined anywhere. Every assertion was silently discarded as ((void)true) in every build, including Debug CI runs. This masked a whole class of latent bugs that compiled only because the assert expressions were never parsed: - DasherView.cpp:58,62 "visibleRegion" (typo for local "vr") - DasherViewSquare.cpp:806 "pRender" (typo for pCurrentNode) - DasherViewSquare.cpp:527 CoversCrosshair called from const method (CoversCrosshair now const — body is pure) - DasherModel.cpp:96,159 "NF_SEEN" (unqualified; should be CDasherNode::NF_SEEN like every other site) - GameModule.cpp:78,82,83 "evt->m_sText" (pre-refactor leftover; the function now takes strText directly) - SettingsStore.cpp:30,36,42 "Settings::ParamBool/ParamLong/ParamString" (never existed; should be PARAM_BOOL etc.) - SettingsStore.cpp:123 "holds_alternative<T>(p->second.type)" (type is a ParameterType enum, not a variant; should check p->second.value) The SettingsStore.cpp:123 case was relaxed to assert only parameter existence, not type match: dasher_get_string_parameter and siblings are documented to throw on type mismatch, and CAPI.cpp catches the bad_variant_access to return "". Asserting would change the contract. Macro change: myassert.h now gates on the standard NDEBUG idiom (active when NDEBUG is NOT defined, matching <cassert> and CMake's Debug preset). Release builds compile asserts to ((void)0). CMakeLists.txt: bumped DASHER_TEST_TIMEOUT 120s -> 300s. With asserts active, the property_invariant test pushes ~100s on slow filesystems (WSL+9P); 300s gives headroom without affecting CI (which caps via `ctest --timeout 120` on the command line). Verified: 31/31 ctest sequential pass in Debug with all asserts active, no run-time assertion fires. Signed-off-by: will wade <willwade@gmail.com>

Two file-scope statics in CAPI.cpp were populated from per-context data but lived in the global static area, so two contexts sharing the same process trampled each other's returned const char* pointers: - s_stringValues backed dasher_get_palette_name, dasher_get_alphabet_name, dasher_get_parameter_string_values - s_gameTextBuf backed dasher_game_get_target_text and dasher_game_get_wrong_text Both are now members of dasher_ctx (ctx->stringValues, ctx->gameTextBuf), matching the existing tlString / stringBuf pattern. Closes the cross-context bug called out in the codebase review. The remaining CAPI.cpp file-scope statics are left in place because they back data that is genuinely process-global: - s_errorString dasher_create() out_error (one-shot) - s_localeCode/Strings process-wide i18n state - s_paramKeys/InfoName/ schema-level parameter metadata Desc/Group/Subgroup (the function doesn't take ctx) - s_buf in dasher_get_ LMRegistry metadata (function doesn't language_model_name/ take ctx; same answer for every ctx) description Verified: 31/31 ctest sequential pass, clang-format clean. Signed-off-by: will wade <willwade@gmail.com>

Tier 1 item #6 from codebase review. 63 commented-out cout/cerr debug lines across 11 files (mostly in LanguageModelling/). The review singled these out as 25 years of accumulated debug residue with no issue tracker references; git history preserves them if a future bug needs the same instrumentation. Files swept: ConversionManager.cpp, DasherModel.cpp, DasherNode.cpp, DefaultFilter.cpp, DictLanguageModel.cpp, MixtureLanguageModel.h, PPMLanguageModel.cpp, PPMPYLanguageModel.cpp, WordLanguageModel.cpp, MandarinAlphMgr.cpp, TwoPushDynamicFilter.cpp Conservative: only lines whose //-stripped content begins with one of {std::cout, std::cerr, cout, cerr, printf, fprintf} were removed, plus pure stream-continuation lines ("<<..." without ";"). Real explanatory comments were preserved (e.g. the math-rationale block in DasherModel.cpp:ScheduleOneStep that interleaves with old debug prints). Verified: clean build, clang-format clean, 0 functional diff (comments + blank lines only). Signed-off-by: will wade <willwade@gmail.com>

Three small follow-ups to Tier 1 #2 (DASHER_ASSERT activation): 1. .clang-tidy: suppress cert-dcl03-c. The check suggests replacing assert() with static_assert, which is a false positive for the runtime invariants DASHER_ASSERT now guards (e.g. DASHER_ASSERT(false) in "should never happen" branches, DASHER_ASSERT(p != nullptr) on runtime pointers). The check fires on every such site now that DASHER_ASSERT properly expands to assert() in Debug builds. 2. GameModule: remove the m_iFontSize private field. Declared and initialized to 36 but never read. The previous "friend class CDemoFilter" declaration (removed in Tier 1 #1) suppressed the -Wunused-private-field warning; now that the friend is gone, clang-tidy correctly surfaces the dead field. 3. .github/workflows/ci.yml: extend the clang-tidy warning filter to exclude [clang-diagnostic-*] as well as [-W*]. Both formats identify compiler warnings — the latter is the compiler's own format, the former is clang-tidy's wrapper around the same warnings. The filter's stated intent ("filter out compiler warnings") was not matching the second form, which would have caused CI to fail on every clang-tidy run that surfaced any compiler warning through the [check-name] reporting path. Verified: 0 clang-tidy findings under the updated filter; build clean; clang-format clean. Signed-off-by: will wade <willwade@gmail.com>

Closes review item "we expect zero warnings from the test files" under MSVC /W4 (the project enables it for MSVC builds in CMakeLists.txt:36). All 18 test-file warnings fixed: C4100 (unreferenced parameter): - test_lifecycle.cpp: comment out names of unused callback lambda parameters C4996 (strdup deprecated): - test_common.h: add portable dasher_strdup() helper (_strdup on MSVC, strdup elsewhere) - test_multilingual.cpp, test_parameters.cpp: use it C4701 (uninitialized local): - test_capi.cpp: value-initialize dasher_parameter_info info{} - test_control_actions_navigation.cpp: initialize lb, hb to 0 C4244 / C4267 (narrowing): - test_deterministic.cpp: cast strlen() to int - test_utf_conversion.cpp: explicit static_cast<int> on pointer arithmetic and strlen Also fixed a real Linux/MSVC portability bug: test_view_geometry.cpp used __INT_MAX__, a GCC/Clang internal macro not defined under MSVC. Switched to standard INT_MAX from <climits>. .gitignore: cover additional local scratch files generated during this verification pass: - build-*/ (any alternate build dir, e.g. build-msvc/) - *.log.txt (build/run logs) - _*.ps1 / _*.bat / _*.sh / _*.py (underscore-prefixed helpers) Verified: - MSVC /W4 build of tests: 0 warnings (was 18) - gcc 14.2 build: clean, 0 warnings - clang-format clean Signed-off-by: will wade <willwade@gmail.com>

…act) Symbol 0 is a special dummy/sentinel that must carry zero probability: AlphabetManager::IterateChildGroups asserts (*pCProb)[0] == 0 so the cumulative-difference arithmetic cp[i]-cp[j] stays valid for all i,j. Every other LM (PPM, PPMPY, RoutingPPM, CTW, Dict) enforced this explicitly. WordLanguageModel::GetProbs did not — it let the spelling model assign symbol 0 real probability, then never filtered it out. On Linux the spelling model happened to assign ~0 probability to symbol 0, masking the bug. On macOS (libc++ unordered_map iteration differs from libstdc++) the spelling model assigned non-zero probability to symbol 0 and DASHER_ASSERT (newly active per Tier 1 #2) caught the malformed cumulative array. Symptoms without the fix: - Debug builds crash on macOS at AlphabetManager.cpp:603 - Release builds silently produce a malformed cumulative array, subtly distorting the rendering of node bounds - property_invariants test observed Word LM "leaky normalization" (total_mass sometimes < 65536) — this was the lost sentinel mass After the fix: Word LM total_mass is exactly 65536, deterministic across runs, matching PPM and CTW. Found by: macOS verification run after Tier 1 #2 activated asserts. Signed-off-by: will wade <willwade@gmail.com>

Tier 2 #2.1. Provides the target spec for Tier 3 refactors: - Component overview (who owns what) - Startup sequence (dasher_create → Realize → module creation) - Mouse input flow (stored on PointerInput, consumed by filter in frame) - Frame render pipeline (input filter timer → model step → render → policy) - Training paths (explicit import + adaptive learning) - Parameter system (manifest → codegen → SettingsStore → events) - Language model hierarchy and normalization contract - Draw command format (the cross-platform rendering abstraction) - Threading model (single-threaded, one ctx per thread) - Test architecture (3 tiers, 31 executables) Designed to be read cold by someone new to the codebase. References file:line for key entry points but does not duplicate code comments. Signed-off-by: will wade <willwade@gmail.com>

…inism") The "Linux-only engine non-determinism" in draw_snapshot_tests and property_invariant_tests was not unordered_map hash randomization as originally hypothesized — it was stale test state. Root cause: create_isolated_context() and ScopedTempDir both generate temp directory names as /tmp/dasher_test_{PID}_{counter}. The counter resets to 0 on each process start; PIDs are frequently reused (especially on WSL). So process run N+1 silently reuses the temp dirs from run N, and XmlSettingsStore::Load() reads the stale dasher_settings.xml left behind — with different parameter values, producing different rendering. Fix: both helpers now call std::filesystem::remove_all on the target path before creating the new directory. This ensures a clean state on every context creation regardless of what previous processes left behind. Evidence: - Before fix: draw_snapshot_tests failed 20-80% of runs (worsening with each successive run as more stale dirs accumulated — up to 2,439 stale dirs observed in /tmp) - After fix: 10/10 consecutive runs pass without any manual cleanup - macOS was unaffected because PID reuse patterns differ - The engine rendering path (deque children, sorted maps, deterministic hash tables) was confirmed fully deterministic by exhaustive audit This resolves Tier 2 #2.4 (non-determinism hunt) — the target was misidentified as unordered_map iteration but was actually test infra. Signed-off-by: will wade <willwade@gmail.com>

Two MSVC test failures root-caused and fixed: 1. dasher_input_filter_tests SIGSEGV in Two-Push Dynamic Mode: CTwoPushDynamicFilter::m_iActiveMarker was uninitialized in the constructor. It is normally set by Timer() during frame rendering, but key presses can arrive before the first frame. On gcc/Linux the heap happens to zero-init it (value 0 = valid array index); on MSVC it contains garbage, causing m_aiTarget[garbage] to crash. Fix: initialize m_iActiveMarker = -1 ("no active marker") in the constructor initializer list. 2. dasher_capi_buffer_lifetime_tests CHECK(last_op1 >= 0) failure: The test read cmds1[cc1-6] AFTER a second dasher_frame() call, which invalidates the cmds1 pointer per the C API contract ("valid until next dasher_frame()"). On gcc the memory wasn't yet overwritten; on MSVC it contained garbage (-572662307). Fix: save the opcode into a local BEFORE the second frame call. Verified: 31/31 tests pass on both Linux (gcc 14.2, -j4) and Windows (MSVC 19.44, /W4). Signed-off-by: will wade <willwade@gmail.com>

…allback (Tier 2 #2.0) Deletes 22 files (~4,000 LOC) comprising three overlapping, unused logging systems, and replaces them with a single C API callback: dasher_set_log_callback(ctx, callback, user_data, min_level) Added to dasher.h: - dasher_log_callback typedef (int level, const char* msg, void* data) - dasher_set_log_callback function declaration - Log levels: 0=debug, 1=info, 2=warn, 3=error - When no callback registered, zero overhead (messages discarded) Deleted (22 files, ~4,000 LOC): FileLogger.{h,cpp} — wrote dasher.log from DIB constructor BasicLog.{h,cpp} — abstract base, nothing subclassed it UserLog.{h,cpp} — full trial-logging system, 0 users UserLogBase.{h,cpp} — base for UserLog/BasicLog UserLogTrial.{h,cpp} — per-trial XML export UserLogParam.{h,cpp} — parameter tracking for user log UserButton.{h,cpp} — button event tracking for user log UserLocation.{h,cpp} — pointer location normalization TimeSpan.{h,cpp} — timing utilities for user log SimpleTimer.{h,cpp} — timer for TimeSpan XMLUtil.{h,cpp} — XML helpers for user log export External edits: - DasherInterfaceBase.{h,cpp}: removed m_pGlobalApplicationLog, m_pUserLog, GetUserLogPtr(), all UserLog calls in Realize/NewFrame/ onUnpause/destructor, FileLogger include + g_iLogLevel constants - InputFilter.h: removed unused #include "UserLogBase.h" - DynamicButtons.cpp, OneButtonDynamicFilter.cpp, TwoButtonDynamicFilter.cpp: removed GetUserLogPtr()->KeyDown() calls - CircleStartHandler.cpp, DasherButtons.cpp, TwoBoxStartHandler.cpp: added #include <limits> (was transitively provided by UserLogBase.h) - Parameters.h/.cpp: regenerated without LP_USER_LOG_LEVEL_MASK - settings_manifest.json: removed LP_USER_LOG_LEVEL_MASK entry - strings_en.json: removed LP_USER_LOG_LEVEL_MASK strings Verified: 31/31 tests pass (Linux gcc 14.2, sequential). The dasher.log file is no longer created — frontends that want diagnostic logging register via dasher_set_log_callback. Signed-off-by: will wade <willwade@gmail.com>

The parameter codegen script emitted tab-indented, compact C++ that required manual clang-format before every commit. Anyone editing settings_manifest.json and running generate_parameters.py would get CI format-check failures if they forgot the manual step. Fix: the script now runs clang-format -i on the generated file automatically. If clang-format is not installed, it prints a warning and falls back to unformatted (still valid) output. Verified: python3 Scripts/generate_parameters.py produces zero clang-format violations. Signed-off-by: will wade <willwade@gmail.com>

The PPM probability-merge loop — ~80 lines of dense numeric logic that is the heart of Dasher's predictive engine — was duplicated identically across CPPMLanguageModel and CPPMPYLanguageModel. The only difference was how children of each trie node are iterated (CPPMnode inline hash vs pychild std::map). Extraction: - Added mergePPMProbs() to CAbstractPPM — implements the full four-phase algorithm: uniform backoff, vine-chain traversal with alpha/beta weighting, leftover distribution, rounding correction. - Added collectChildCounts() virtual to CAbstractPPM — abstracts child iteration. Default uses ChildIterator (standard PPM); PPMPY overrides to iterate its pychild map. - CPPMLanguageModel::GetProbs now delegates to mergePPMProbs (7 lines, down from 95). - CPPMPYLanguageModel::GetProbs now delegates to mergePPMProbs (6 lines, down from 106) with a collectChildCounts override (8 lines). - CRoutingPPMLanguageModel unchanged for now — its two-pass algorithm (base PPM + route distribution) has its own TODO at line 52-54 that can now reference the shared method. Left as follow-up to avoid changing behavior of the Mandarin routing model. Documentation: - mergePPMProbs has a thorough block comment explaining the four phases, the alpha/beta formula, vine-pointer backoff, and the sentinel symbol 0 contract. DasherCore is studied as a PPM reference implementation — the code should be exemplary. - collectChildCounts documents the inline hash vs pychild design. - The FIXME on doExclusion is preserved with context on why it's disabled. Verified: 31/31 tests pass, including ppm_golden, lm_correctness (14 cases), property_invariant (LM normalization), draw_snapshot (deterministic rendering), and draw_command (rendering output). Signed-off-by: will wade <willwade@gmail.com>

CModuleManager had 5 method pairs (InputDevice vs InputMethod) that were line-for-line identical modulo type and map name — 60 of 118 lines were mechanical copy-paste. Extracted a generic ModuleMap<T> template that handles registration (both raw-pointer and unique_ptr versions), lookup, listing, and default selection. CModuleManager now holds two ModuleMap instances and provides thin forwarding methods. 118 lines → 62 lines (.cpp) + 93 lines (.h, including the template with thorough documentation). The logic now exists in ONE place. Verified: interaction, input_filter, and lifecycle tests all pass. Signed-off-by: will wade <willwade@gmail.com>

The legacy codebase has ~2,700 C4100 (unreferenced formal parameter) warnings — all from inherited virtual method overrides where the parameter is intentionally unused in specific subclasses. These are style issues, not bugs. Suppressed globally via /wd4100. The bug-finding warnings remain enabled: C4701 (uninitialized), C4244 (narrowing), C4389 (signed/ unsigned mismatch), etc. Warning count drops from ~2,822 to 119. The remaining 119 are genuine code-quality issues (99 narrowing conversions, 8 uninitialized vars, 4 C-linkage UDT) that should be fixed incrementally as files are touched. Tests still get full /W4 with zero warnings (already enforced). Signed-off-by: will wade <willwade@gmail.com>

Test files were formatted with clang-format 19 locally; CI uses 18.1.8 which produces slightly different output. Reformatted all 15 affected test files with clang-format-18 to match CI exactly. Signed-off-by: will wade <willwade@gmail.com>

….1) Rewrites the signal/slot event system to fix two long-standing issues: 1. Non-deterministic broadcast order: The original unordered_map<void*, fn> iterated in hash-dependent order, which was randomized per-process on Linux (libstdc++). This could cause subtle ordering bugs if multiple subscribers' side effects compound. Switched to std::map for sorted (deterministic) iteration. 2. Manual unsubscribe hazard: The original system required every subscriber to call Unsubscribe(this) before destruction — forgetting caused a use-after-free on the next Broadcast(). Added an RAII Subscription guard that auto-unsubscribes when it goes out of scope. New code can use the keyless Subscribe(fn) overload that returns a Subscription to store as a member. The void* Subscribe(listener, fn) / Unsubscribe(listener) API is preserved exactly — zero changes to the 18 Subscribe and 22 Unsubscribe call sites. Existing callers continue to work unchanged. The keyless Subscribe(fn) overload is opt-in for new code. Documentation: thorough block comments explaining the signal/slot pattern, both usage patterns (legacy void* and RAII), and the design rationale. Verified: 31/31 tests pass. Signed-off-by: will wade <willwade@gmail.com>

The review recommended splitting into "one file per action class" (16 files). Many classes are 3-5 lines each — creating 15+ tiny files would hurt navigation, not help it. Instead, added a class hierarchy ASCII diagram to the file-level comment showing the full architecture: ActionRegistry → ControlAction subclasses → NodeTemplate → CContNode → CControlManager. Also fixed misleading [[maybe_unused]] on SocketOutputAction and ATSPIAction member fields — these classes ARE instantiated (from alphabet XML in AlphIO.cpp), just their execute() methods are no-ops. Added explanatory comments documenting this. No structural change to the code — purely documentation improvement. Verified: control_action tests pass. Signed-off-by: will wade <willwade@gmail.com>

Signed-off-by: will wade <willwade@gmail.com>

…5) Adds Rule 7: Naming Convention to CONTRIBUTING.md. The convention applies to NEW code only — existing CFoo/m_iFoo legacy naming is explicitly preserved (no bulk renames). Convention for new code: - Classes: PascalCase (not CFoo) - Methods: PascalCase (not snake_case) - Members: snake_case_ trailing underscore (not m_iFoo) - Free functions: snake_case - Files: PascalCase.h / PascalCase.cpp Also documents the existing conventions new contributors will encounter (CFoo, m_iFoo, Hungarian notation) so they understand the codebase's history without being confused by the mix. This formalizes the review's recommendation #20 as a contribution rule. Signed-off-by: will wade <willwade@gmail.com>

Two fixes: 1. Mixture LM normalization leak: CMixtureLanguageModel::GetProbs was missing Probs[0]=0 (sentinel) and had no rounding correction. The blend of two sub-LMs lost ~519 units to integer truncation, leaving ~0.8% of the Dasher canvas as dead zones. Fixed by: - Setting Probs[0] = 0 (same contract as every other LM) - Adding rounding correction that distributes the residual Mixture total_mass: 65017 → 65536 (exact) 2. Log callback never invoked: dasher_set_log_callback stored the callback in ctx->logCb but no engine code called it. Now the Interface::Message() override routes engine messages through BOTH the message callback (for UI display) AND the log callback (for diagnostic logging). Modal/interrupt messages = WARN level, async messages = INFO level. Frontends that registered dasher_set_log_callback will now receive engine messages. Verified: 7/7 LM-sensitive tests pass (property_invariant, lm_correctness, ppm_golden, ppm_serialization, capi, capi_buffer_lifetime, benchmark). Signed-off-by: will wade <willwade@gmail.com>

The std::map change (commit db5f6ec) caused stuttering while zooming on Windows. MSVC's std::map allocates each node separately (red-black tree), creating cache misses during Broadcast. With typical subscriber counts (5-10), unordered_map's contiguous buckets are significantly faster for both iteration and lookup. Reverted the storage to unordered_map. Preserved the RAII Subscription improvement (the keyless Subscribe overload and auto-unsubscribe guard) — those are orthogonal to the storage container. Confirmed via git bisect on Dasher-Windows: Event.h was the commit that introduced the stuttering. Updated design notes to document why we use unordered_map, so the next person who thinks "let's make this deterministic" sees the warning. 31/31 tests pass. Signed-off-by: will wade <willwade@gmail.com>

PR #28 added v5 alphabet format support but introduced an emergent interaction bug where the 6 changes combined caused the paragraphSpace group (containing ¶ and □/space) to disappear from parsed alphabets. Users could not type spaces — text came out as "hellomynameis". Confirmed: pre-PR #28 code yields 62 symbols (space present); post-PR #28 code yields 60 symbols (space missing). The Windows team isolated the bug: each individual change works, but all 6 combined break it. The prime suspect is the Parse/ParseSingle dispatch split, which somehow changes pugixml node iteration behavior. Reverted AlphIO.cpp and AlphIO.h to pre-PR #28 state. v5 alphabet support will be re-implemented properly in a follow-up that doesn't break v6 parsing. Credit: Dasher-Windows team (bisect + revert branch). Verified: 62 symbols restored, 31/31 tests pass. Signed-off-by: will wade <willwade@gmail.com>

willwade added 29 commits June 22, 2026 09:51

style: fix Event.h clang-format 18 formatting

d514262

Signed-off-by: will wade <willwade@gmail.com>

willwade merged commit 937e688 into main Jun 24, 2026
14 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Tests/phase c deeper coverage#30

Tests/phase c deeper coverage#30
willwade merged 29 commits into
mainfrom
tests/phase-c-deeper-coverage

willwade commented Jun 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

willwade commented Jun 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant