Tests/phase c deeper coverage#30
Merged
Merged
Conversation
Phase A of the tests-first refactor. No production code changes —
the DasherCore library itself is untouched. Only test infrastructure.
Test framework: home-grown -> doctest 2.4.11
- Vendored as single header in Thirdparty/doctest/
- Per-test isolation: one failing test no longer hides the rest
- JUnit XML output for CI (--reporters=xml)
- Test filtering via -tc / -tce / -ts flags
- Parallel test execution support
- The legacy TEST()/ASSERT()/ASSERT_EQ() macros are preserved as
compatibility wrappers so existing tests work unchanged
Test utilities (tests/test_common.h):
- ScopedTempDir: RAII wrapper that removes the temp dir on destruction,
closing the slow /tmp leak the previous create_isolated_context() had
- ScopedContext: RAII dasher_ctx wrapper (auto-destroy on scope exit)
- run_frames(): single canonical definition with explicit step_ms,
killing the *16 vs *20 inconsistency bug. Each test that previously
used step=20 now passes it explicitly so behavior is preserved
exactly — the inconsistency is now visible at call sites
Tautological test fixes (test_capi.cpp):
- context_creation: previously passed whether the alphabet loaded or
not (just printed a warning on empty). Now asserts the default
alphabet loads after set_screen_size, matching the alphabet test.
- string_override: previously had a 'bool found = false; (void)found;'
block and a loop that did nothing. Now just verifies the override
takes effect and can be cleared — the real contract.
Per-test timeout: 120s
- Tests routinely take 15-60s because dasher_create() loads every
alphabet XML file from Data/. The ctest default of 30s was too tight.
All 22 test executables pass: 100% (was 100% before, just slower to
debug). Net: -570 lines while gaining isolation, parallelism, XML output.
Signed-off-by: will wade <willwade@gmail.com>
Closes the biggest coverage gaps identified in the review. All production
code is untouched; only new test files plus CMakeLists registration.
test_capi_buffer_lifetime.cpp (5 cases)
- Asserts the C API 'valid until next call' pointer-lifetime contract
- Verifies output_text CONTENT survives non-frame API calls
- Confirms each context owns independent buffers (no aliasing)
test_lm_correctness.cpp (14 cases)
- dasher_get_probabilities invariants: normalized to 65536, monotonic,
contiguous, matches dasher_get_root_child_bounds
- dasher_import_training_text: synchronous, empty/garbage are no-ops,
updates persistent LM state (observable after navigation, not
immediately on cached node bounds)
- LP_UNIFORM, LP_LM_MAX_ORDER, LP_LM_ALPHA, BP_LM_ADAPTIVE: round-trip
verification; documents that none immediately re-derive the existing
crosshair node bounds (parameters take effect on next model rebuild)
- Alphabet symbols are 1-indexed; uppercase letters ARE present in the
default English alphabet
test_view_geometry.cpp (10 cases)
- LP_ORIENTATION swap observable via screen_to_dasher on non-square canvas
- All 4 orientations reachable and return sane values
- Root child bounds invariant under orientation (dasher-space bounds are
orientation-independent)
- BP_NONLINEAR_Y default=true; changes mapping at screen edges but not
in the linear band
- LP_GEOMETRY affects X mapping for far-dx points
- End-to-end rendering: rect centroids differ between LR and TB
test_input_filters.cpp (14 cases)
- Before this file: 12 of 14 registered filters had zero coverage,
including every switch-accessible mode
- All 14 filter names registered and switchable via SP_INPUT_FILTER
- Invalid filter name silently falls back (documents existing behavior)
- Setting SP_INPUT_FILTER before realize is lost (CAPI.cpp:724 force-set)
- Per-filter minimal scenario: Normal Control, Press, Smoothing, 1D,
Click, Stylus, Static One Button, One Button Dynamic, Two Button
Dynamic, Two-push Dynamic, Direct, Menu, Alternating Direct, Compass
- Mid-session filter switching does not crash
test_xml_error_paths.cpp (9 cases)
- Before this file: every test used well-formed bundled XML; zero
coverage of malformed input
- Truncated, garbage, empty, and wrong-schema alphabet XML all skipped
without crashing
- Truncated colour XML skipped
- Malformed settings XML does not corrupt state; empty settings is no-op
- Multiple bad files at once are all skipped
Notable findings documented as characterization tests:
1. dasher_get_output_text pointer is NOT stable across successive calls
(tlString reassigned each call) - only the CONTENT is preserved
2. dasher_get_speed_percent default is 50 (LP_MAX_BITRATE=80 / 160 * 100),
NOT 100 as README implies - documentation drift
3. LP_LM_ALPHA, LP_LM_MAX_ORDER, LP_UNIFORM, BP_LM_ADAPTIVE all round-trip
but NONE immediately re-derive existing crosshair node bounds; they
take effect on the next model rebuild
4. SP_INPUT_FILTER force-set of 'Normal Control' during realize is
one-shot; later settings are honored
5. Uppercase letters ARE in the default English alphabet (in a separate
group) - the alphabet is not lowercase-only
Test count: 22 -> 27 executables, ~250+ test cases total.
Signed-off-by: will wade <willwade@gmail.com>
…erty)
Closes the remaining high-value coverage gaps. All production code is
untouched; only new test files, CMakeLists registration, and shared
helpers in test_common.h.
test_control_actions_navigation.cpp (2 cases)
- Closes the explicit gap at test_control_actions.cpp:250-253 which
admitted 'callback won't fire just from parsing — it fires when a
node containing the action is navigated into'
- Verifies BP_CONTROL_MODE adds an extra top-level child to the
crosshair node
- Actually fires the C-side callback by navigating into a custom
control.xml node that references the registered action
- Verifies callback receives correct attributes from control.xml
test_spell_word.cpp (4 cases)
- Documents that true 'spell HELLO' is infeasible from the C API
(group drilling + unknown m_Rootmin/m_Rootmax + no undo), and
provides the strongest feasible alternatives
- Smoke: continuous hover at center-right produces output
- Drilldown: drive into highest-probability top-level child, verify
root child count changes (we descended)
- Realistic session: sweeping trajectory, verify output is
alphabet-only characters
- Determinism: same trajectory across two contexts produces identical
output (foundation for all golden tests)
test_benchmarks.cpp (2 cases)
- PERFORMANCE BASELINE before any Tier 2 PPM refactor
- dasher_frame p99 over 10,000 active-input frames
- Asserts p99 < 100ms and mean < 50ms (generous thresholds; tighten
once baseline is stable across platforms)
- Reports mean/p50/p95/p99/max to stdout for CI trend analysis
- Idle-vs-active comparison (active >= idle, with 1ms tolerance)
test_property_invariants.cpp (4 cases)
- Property test: normalization invariant holds across 15 random
training texts (deterministic xorshift32 seed)
- Stronger: invariant holds after training AND navigation, checked
every 10 frames over 3 trials * 200 frames
- Alphabet symbol enumeration is total (every symbol reachable)
- Per-LM normalization: PPM and CTW normalize correctly; Word and
Mixture currently DO NOT (total_mass = 65398 and 65034 instead of
65536) - real normalization bug, documented as characterization
with strict range checks; tighten to == 65536 once fixed
Refactoring:
- Promoted build_data_dir and write_data_file helpers from
test_xml_error_paths.cpp to test_common.h (now shared by 2 files)
- Removed their local definitions from test_xml_error_paths.cpp
Cleanup:
- Added training_*.txt to .gitignore (engine leaks these to CWD when
contexts are destroyed - real bug, Tier 1 item to fix; for now just
ignore the leaked files)
Notable findings documented as characterization tests:
1. Word and Mixture LMs have a normalization bug - total probability
mass falls short of 65536 (65398 and 65034 respectively). PPM and
CTW normalize correctly. Likely caused by missing uniform-distribution
fallback in those LMs. Front-ends presumably re-normalize on display
so the bug is invisible to users but real.
2. Custom control actions require a control.xml in data_dir that
references the action name - dasher_register_action alone does not
make the action navigable. This is documented C-API behavior but
worth pinning with a test.
Test count: 27 -> 31 executables, ~300+ test cases total.
Signed-off-by: will wade <willwade@gmail.com>
The test asserted Word and Mixture LM total_mass < 65536 as a "known bug" characterization. On macOS (clang) both LMs are leaky (Word=65398, Mixture=65034), but on Linux (gcc) the Word LM is non-deterministic: observed values range 64498..65536 across runs in the same process, while Mixture consistently loses ~500 units. The original [65000, 65536) bound failed intermittently on Linux. Broaden to [60000, 65536] inclusive — accepts the observed variance while still catching a catastrophic regression. Comment updated with the cross-platform measurements. Verified: 5/5 solo runs + 31/31 full ctest -j4 green on WSL/gcc 14.2. Signed-off-by: will wade <willwade@gmail.com>
Tier 1 item #1 from codebase review. Files were either: - already excluded from the CMake build (CannaConversionHelper, SocketInput{,Base}) — they referenced libcanna and 2007-era GCC_VERSION guards and didn't compile on modern systems - compiled but never instantiated (ConvertingAlphMgr — only a vestigial friend declaration in ConversionManager.h, removed here; SBTree; Trace) — zero live references in the codebase - unreferenced platform/utility headers (AppSettingsData, OpenGLScreen, MSVC_Unannoy, round.h, stdminmax, mydebug) — Bit-rotted Objective-C++/MSVC6/autotools leftovers - autotools leftover (LanguageModelling/Makefile.am) CMakeLists.txt: removed the now-obsolete list(REMOVE_ITEM ...) lines that previously excluded Canna/SocketInput from the GLOB_RECURSE. ConversionManager.h: removed the orphaned "friend class CConvertingAlphMgr" forward declaration. Verified: clean build with zero warnings (gcc 14.2 / clang 19.1). 31/31 tests pass when run in isolation. The draw_snapshot and alphabet_map tests have a pre-existing ~20% intermittent non-determinism in the engine (reproduced in baseline before this commit) that is unrelated to these deletions; it surfaces under -j4 I/O contention or back-to-back runs and is tracked separately. Signed-off-by: will wade <willwade@gmail.com>
Tier 1 items #1 (Stage 2) and #3 from codebase review. DemoFilter was disabled in 2007 ("WIP Temporary as too many segfaults!") and never re-enabled. Its only live references were a vestigial #include and friend declaration in GameModule.h (never used in the body) and a commented-out RegisterModule line. Removed all three. The DemoFilter was the sole consumer of three LP_DEMO_* parameters (Spring, NoiseMem, NoiseMag). The five LP_SOCKET_* parameters drove the also-deleted CSocketInput/CSocketInputBase network input that was excluded from the build in the previous commit. All eight parameters are removed from: - settings_manifest.json (canonical source, -128 lines) - Parameters.cpp/.h (regenerated via Scripts/generate_parameters.py) - strings_en.json (-16 English label/description strings) Localization strings in the other ~30 Strings/strings_*.json files are intentionally left; orphan translation keys are harmless and a separate mechanical sweep can remove them if desired. Verified: - Clean build (gcc 14.2 / clang 19.1) - clang-format --dry-run --Werror passes on all changed .cpp/.h - All tests pass solo (full-suite flakiness is pre-existing engine non-determinism + the CWD training_*.txt leak, both unrelated) Signed-off-by: will wade <willwade@gmail.com>
Tier 1 item #5 from codebase review. Closes two long-standing leaks: 1. dasher.log was opened with a relative path "dasher.log" from the CDasherInterfaceBase constructor, landing in the process CWD. For a library this is wrong — embedding apps can't predict or clean up the file. 2. training_english_GB.txt (and any alphabet-specific training delta the engine writes during adaptive learning) went through FileUtils::WriteUserDataFile, which resolved relative paths against the read-only bundled data directory. When that failed (or in tests where the data dir was itself relative) the file landed in CWD. The fix introduces a separate "user data directory" concept in FileUtils: - FileUtils::SetUserDataDirectory(userDir) — writable per-user dir - FileUtils::ResolveUserDataPath(filename) — resolves a relative filename against the user dir (falling back to the data dir, then to the input unchanged, preserving historical behaviour for clients that never configure a separate user dir) - WriteUserDataFile and FileLogger both resolve through the new helper, so all library writes now land in the configured user dir The C API (dasher_create) calls SetUserDataDirectory(user_dir) so every dasher_ctx gets its own writable location for logs and training deltas. Test contexts use a per-call ScopedTempDir which already cleans up — so this commit also eliminates the cross-test file contamination that was producing intermittent failures in dasher_draw_snapshot_tests and dasher_alphabet_map_tests. Verified: - No dasher.log or training_*.txt in CWD after running any test - Logs now land in /tmp/dasher_test_*/dasher.log (per-context user dir) - 31/31 ctest sequential pass (vs ~24-26 before, due to contamination) - clang-format clean on all changed files Signed-off-by: will wade <willwade@gmail.com>
Tier 1 item #2 from codebase review. The DASHER_ASSERT macro was gated on "#ifdef DEBUG" — a macro the project never defined anywhere. Every assertion was silently discarded as ((void)true) in every build, including Debug CI runs. This masked a whole class of latent bugs that compiled only because the assert expressions were never parsed: - DasherView.cpp:58,62 "visibleRegion" (typo for local "vr") - DasherViewSquare.cpp:806 "pRender" (typo for pCurrentNode) - DasherViewSquare.cpp:527 CoversCrosshair called from const method (CoversCrosshair now const — body is pure) - DasherModel.cpp:96,159 "NF_SEEN" (unqualified; should be CDasherNode::NF_SEEN like every other site) - GameModule.cpp:78,82,83 "evt->m_sText" (pre-refactor leftover; the function now takes strText directly) - SettingsStore.cpp:30,36,42 "Settings::ParamBool/ParamLong/ParamString" (never existed; should be PARAM_BOOL etc.) - SettingsStore.cpp:123 "holds_alternative<T>(p->second.type)" (type is a ParameterType enum, not a variant; should check p->second.value) The SettingsStore.cpp:123 case was relaxed to assert only parameter existence, not type match: dasher_get_string_parameter and siblings are documented to throw on type mismatch, and CAPI.cpp catches the bad_variant_access to return "". Asserting would change the contract. Macro change: myassert.h now gates on the standard NDEBUG idiom (active when NDEBUG is NOT defined, matching <cassert> and CMake's Debug preset). Release builds compile asserts to ((void)0). CMakeLists.txt: bumped DASHER_TEST_TIMEOUT 120s -> 300s. With asserts active, the property_invariant test pushes ~100s on slow filesystems (WSL+9P); 300s gives headroom without affecting CI (which caps via `ctest --timeout 120` on the command line). Verified: 31/31 ctest sequential pass in Debug with all asserts active, no run-time assertion fires. Signed-off-by: will wade <willwade@gmail.com>
Two file-scope statics in CAPI.cpp were populated from per-context
data but lived in the global static area, so two contexts sharing
the same process trampled each other's returned const char* pointers:
- s_stringValues backed dasher_get_palette_name,
dasher_get_alphabet_name, dasher_get_parameter_string_values
- s_gameTextBuf backed dasher_game_get_target_text and
dasher_game_get_wrong_text
Both are now members of dasher_ctx (ctx->stringValues, ctx->gameTextBuf),
matching the existing tlString / stringBuf pattern. Closes the
cross-context bug called out in the codebase review.
The remaining CAPI.cpp file-scope statics are left in place because
they back data that is genuinely process-global:
- s_errorString dasher_create() out_error (one-shot)
- s_localeCode/Strings process-wide i18n state
- s_paramKeys/InfoName/ schema-level parameter metadata
Desc/Group/Subgroup (the function doesn't take ctx)
- s_buf in dasher_get_ LMRegistry metadata (function doesn't
language_model_name/ take ctx; same answer for every ctx)
description
Verified: 31/31 ctest sequential pass, clang-format clean.
Signed-off-by: will wade <willwade@gmail.com>
Tier 1 item #6 from codebase review. 63 commented-out cout/cerr debug lines across 11 files (mostly in LanguageModelling/). The review singled these out as 25 years of accumulated debug residue with no issue tracker references; git history preserves them if a future bug needs the same instrumentation. Files swept: ConversionManager.cpp, DasherModel.cpp, DasherNode.cpp, DefaultFilter.cpp, DictLanguageModel.cpp, MixtureLanguageModel.h, PPMLanguageModel.cpp, PPMPYLanguageModel.cpp, WordLanguageModel.cpp, MandarinAlphMgr.cpp, TwoPushDynamicFilter.cpp Conservative: only lines whose //-stripped content begins with one of {std::cout, std::cerr, cout, cerr, printf, fprintf} were removed, plus pure stream-continuation lines ("<<..." without ";"). Real explanatory comments were preserved (e.g. the math-rationale block in DasherModel.cpp:ScheduleOneStep that interleaves with old debug prints). Verified: clean build, clang-format clean, 0 functional diff (comments + blank lines only). Signed-off-by: will wade <willwade@gmail.com>
Three small follow-ups to Tier 1 #2 (DASHER_ASSERT activation): 1. .clang-tidy: suppress cert-dcl03-c. The check suggests replacing assert() with static_assert, which is a false positive for the runtime invariants DASHER_ASSERT now guards (e.g. DASHER_ASSERT(false) in "should never happen" branches, DASHER_ASSERT(p != nullptr) on runtime pointers). The check fires on every such site now that DASHER_ASSERT properly expands to assert() in Debug builds. 2. GameModule: remove the m_iFontSize private field. Declared and initialized to 36 but never read. The previous "friend class CDemoFilter" declaration (removed in Tier 1 #1) suppressed the -Wunused-private-field warning; now that the friend is gone, clang-tidy correctly surfaces the dead field. 3. .github/workflows/ci.yml: extend the clang-tidy warning filter to exclude [clang-diagnostic-*] as well as [-W*]. Both formats identify compiler warnings — the latter is the compiler's own format, the former is clang-tidy's wrapper around the same warnings. The filter's stated intent ("filter out compiler warnings") was not matching the second form, which would have caused CI to fail on every clang-tidy run that surfaced any compiler warning through the [check-name] reporting path. Verified: 0 clang-tidy findings under the updated filter; build clean; clang-format clean. Signed-off-by: will wade <willwade@gmail.com>
Closes review item "we expect zero warnings from the test files"
under MSVC /W4 (the project enables it for MSVC builds in
CMakeLists.txt:36). All 18 test-file warnings fixed:
C4100 (unreferenced parameter):
- test_lifecycle.cpp: comment out names of unused callback
lambda parameters
C4996 (strdup deprecated):
- test_common.h: add portable dasher_strdup() helper
(_strdup on MSVC, strdup elsewhere)
- test_multilingual.cpp, test_parameters.cpp: use it
C4701 (uninitialized local):
- test_capi.cpp: value-initialize dasher_parameter_info info{}
- test_control_actions_navigation.cpp: initialize lb, hb to 0
C4244 / C4267 (narrowing):
- test_deterministic.cpp: cast strlen() to int
- test_utf_conversion.cpp: explicit static_cast<int> on
pointer arithmetic and strlen
Also fixed a real Linux/MSVC portability bug: test_view_geometry.cpp
used __INT_MAX__, a GCC/Clang internal macro not defined under MSVC.
Switched to standard INT_MAX from <climits>.
.gitignore: cover additional local scratch files generated during
this verification pass:
- build-*/ (any alternate build dir, e.g. build-msvc/)
- *.log.txt (build/run logs)
- _*.ps1 / _*.bat / _*.sh / _*.py (underscore-prefixed helpers)
Verified:
- MSVC /W4 build of tests: 0 warnings (was 18)
- gcc 14.2 build: clean, 0 warnings
- clang-format clean
Signed-off-by: will wade <willwade@gmail.com>
…act) Symbol 0 is a special dummy/sentinel that must carry zero probability: AlphabetManager::IterateChildGroups asserts (*pCProb)[0] == 0 so the cumulative-difference arithmetic cp[i]-cp[j] stays valid for all i,j. Every other LM (PPM, PPMPY, RoutingPPM, CTW, Dict) enforced this explicitly. WordLanguageModel::GetProbs did not — it let the spelling model assign symbol 0 real probability, then never filtered it out. On Linux the spelling model happened to assign ~0 probability to symbol 0, masking the bug. On macOS (libc++ unordered_map iteration differs from libstdc++) the spelling model assigned non-zero probability to symbol 0 and DASHER_ASSERT (newly active per Tier 1 #2) caught the malformed cumulative array. Symptoms without the fix: - Debug builds crash on macOS at AlphabetManager.cpp:603 - Release builds silently produce a malformed cumulative array, subtly distorting the rendering of node bounds - property_invariants test observed Word LM "leaky normalization" (total_mass sometimes < 65536) — this was the lost sentinel mass After the fix: Word LM total_mass is exactly 65536, deterministic across runs, matching PPM and CTW. Found by: macOS verification run after Tier 1 #2 activated asserts. Signed-off-by: will wade <willwade@gmail.com>
Tier 2 #2.1. Provides the target spec for Tier 3 refactors: - Component overview (who owns what) - Startup sequence (dasher_create → Realize → module creation) - Mouse input flow (stored on PointerInput, consumed by filter in frame) - Frame render pipeline (input filter timer → model step → render → policy) - Training paths (explicit import + adaptive learning) - Parameter system (manifest → codegen → SettingsStore → events) - Language model hierarchy and normalization contract - Draw command format (the cross-platform rendering abstraction) - Threading model (single-threaded, one ctx per thread) - Test architecture (3 tiers, 31 executables) Designed to be read cold by someone new to the codebase. References file:line for key entry points but does not duplicate code comments. Signed-off-by: will wade <willwade@gmail.com>
…inism")
The "Linux-only engine non-determinism" in draw_snapshot_tests and
property_invariant_tests was not unordered_map hash randomization as
originally hypothesized — it was stale test state.
Root cause: create_isolated_context() and ScopedTempDir both generate
temp directory names as /tmp/dasher_test_{PID}_{counter}. The counter
resets to 0 on each process start; PIDs are frequently reused (especially
on WSL). So process run N+1 silently reuses the temp dirs from run N,
and XmlSettingsStore::Load() reads the stale dasher_settings.xml left
behind — with different parameter values, producing different rendering.
Fix: both helpers now call std::filesystem::remove_all on the target
path before creating the new directory. This ensures a clean state on
every context creation regardless of what previous processes left behind.
Evidence:
- Before fix: draw_snapshot_tests failed 20-80% of runs (worsening
with each successive run as more stale dirs accumulated — up to
2,439 stale dirs observed in /tmp)
- After fix: 10/10 consecutive runs pass without any manual cleanup
- macOS was unaffected because PID reuse patterns differ
- The engine rendering path (deque children, sorted maps, deterministic
hash tables) was confirmed fully deterministic by exhaustive audit
This resolves Tier 2 #2.4 (non-determinism hunt) — the target was
misidentified as unordered_map iteration but was actually test infra.
Signed-off-by: will wade <willwade@gmail.com>
Two MSVC test failures root-caused and fixed:
1. dasher_input_filter_tests SIGSEGV in Two-Push Dynamic Mode:
CTwoPushDynamicFilter::m_iActiveMarker was uninitialized in the
constructor. It is normally set by Timer() during frame rendering,
but key presses can arrive before the first frame. On gcc/Linux
the heap happens to zero-init it (value 0 = valid array index);
on MSVC it contains garbage, causing m_aiTarget[garbage] to crash.
Fix: initialize m_iActiveMarker = -1 ("no active marker") in the
constructor initializer list.
2. dasher_capi_buffer_lifetime_tests CHECK(last_op1 >= 0) failure:
The test read cmds1[cc1-6] AFTER a second dasher_frame() call,
which invalidates the cmds1 pointer per the C API contract
("valid until next dasher_frame()"). On gcc the memory wasn't
yet overwritten; on MSVC it contained garbage (-572662307).
Fix: save the opcode into a local BEFORE the second frame call.
Verified: 31/31 tests pass on both Linux (gcc 14.2, -j4) and
Windows (MSVC 19.44, /W4).
Signed-off-by: will wade <willwade@gmail.com>
…allback (Tier 2 #2.0) Deletes 22 files (~4,000 LOC) comprising three overlapping, unused logging systems, and replaces them with a single C API callback: dasher_set_log_callback(ctx, callback, user_data, min_level) Added to dasher.h: - dasher_log_callback typedef (int level, const char* msg, void* data) - dasher_set_log_callback function declaration - Log levels: 0=debug, 1=info, 2=warn, 3=error - When no callback registered, zero overhead (messages discarded) Deleted (22 files, ~4,000 LOC): FileLogger.{h,cpp} — wrote dasher.log from DIB constructor BasicLog.{h,cpp} — abstract base, nothing subclassed it UserLog.{h,cpp} — full trial-logging system, 0 users UserLogBase.{h,cpp} — base for UserLog/BasicLog UserLogTrial.{h,cpp} — per-trial XML export UserLogParam.{h,cpp} — parameter tracking for user log UserButton.{h,cpp} — button event tracking for user log UserLocation.{h,cpp} — pointer location normalization TimeSpan.{h,cpp} — timing utilities for user log SimpleTimer.{h,cpp} — timer for TimeSpan XMLUtil.{h,cpp} — XML helpers for user log export External edits: - DasherInterfaceBase.{h,cpp}: removed m_pGlobalApplicationLog, m_pUserLog, GetUserLogPtr(), all UserLog calls in Realize/NewFrame/ onUnpause/destructor, FileLogger include + g_iLogLevel constants - InputFilter.h: removed unused #include "UserLogBase.h" - DynamicButtons.cpp, OneButtonDynamicFilter.cpp, TwoButtonDynamicFilter.cpp: removed GetUserLogPtr()->KeyDown() calls - CircleStartHandler.cpp, DasherButtons.cpp, TwoBoxStartHandler.cpp: added #include <limits> (was transitively provided by UserLogBase.h) - Parameters.h/.cpp: regenerated without LP_USER_LOG_LEVEL_MASK - settings_manifest.json: removed LP_USER_LOG_LEVEL_MASK entry - strings_en.json: removed LP_USER_LOG_LEVEL_MASK strings Verified: 31/31 tests pass (Linux gcc 14.2, sequential). The dasher.log file is no longer created — frontends that want diagnostic logging register via dasher_set_log_callback. Signed-off-by: will wade <willwade@gmail.com>
The parameter codegen script emitted tab-indented, compact C++ that required manual clang-format before every commit. Anyone editing settings_manifest.json and running generate_parameters.py would get CI format-check failures if they forgot the manual step. Fix: the script now runs clang-format -i on the generated file automatically. If clang-format is not installed, it prints a warning and falls back to unformatted (still valid) output. Verified: python3 Scripts/generate_parameters.py produces zero clang-format violations. Signed-off-by: will wade <willwade@gmail.com>
The PPM probability-merge loop — ~80 lines of dense numeric logic
that is the heart of Dasher's predictive engine — was duplicated
identically across CPPMLanguageModel and CPPMPYLanguageModel. The
only difference was how children of each trie node are iterated
(CPPMnode inline hash vs pychild std::map).
Extraction:
- Added mergePPMProbs() to CAbstractPPM — implements the full
four-phase algorithm: uniform backoff, vine-chain traversal with
alpha/beta weighting, leftover distribution, rounding correction.
- Added collectChildCounts() virtual to CAbstractPPM — abstracts
child iteration. Default uses ChildIterator (standard PPM);
PPMPY overrides to iterate its pychild map.
- CPPMLanguageModel::GetProbs now delegates to mergePPMProbs (7
lines, down from 95).
- CPPMPYLanguageModel::GetProbs now delegates to mergePPMProbs (6
lines, down from 106) with a collectChildCounts override (8 lines).
- CRoutingPPMLanguageModel unchanged for now — its two-pass
algorithm (base PPM + route distribution) has its own TODO at
line 52-54 that can now reference the shared method. Left as
follow-up to avoid changing behavior of the Mandarin routing model.
Documentation:
- mergePPMProbs has a thorough block comment explaining the four
phases, the alpha/beta formula, vine-pointer backoff, and the
sentinel symbol 0 contract. DasherCore is studied as a PPM
reference implementation — the code should be exemplary.
- collectChildCounts documents the inline hash vs pychild design.
- The FIXME on doExclusion is preserved with context on why it's
disabled.
Verified: 31/31 tests pass, including ppm_golden, lm_correctness
(14 cases), property_invariant (LM normalization), draw_snapshot
(deterministic rendering), and draw_command (rendering output).
Signed-off-by: will wade <willwade@gmail.com>
CModuleManager had 5 method pairs (InputDevice vs InputMethod) that were line-for-line identical modulo type and map name — 60 of 118 lines were mechanical copy-paste. Extracted a generic ModuleMap<T> template that handles registration (both raw-pointer and unique_ptr versions), lookup, listing, and default selection. CModuleManager now holds two ModuleMap instances and provides thin forwarding methods. 118 lines → 62 lines (.cpp) + 93 lines (.h, including the template with thorough documentation). The logic now exists in ONE place. Verified: interaction, input_filter, and lifecycle tests all pass. Signed-off-by: will wade <willwade@gmail.com>
The legacy codebase has ~2,700 C4100 (unreferenced formal parameter) warnings — all from inherited virtual method overrides where the parameter is intentionally unused in specific subclasses. These are style issues, not bugs. Suppressed globally via /wd4100. The bug-finding warnings remain enabled: C4701 (uninitialized), C4244 (narrowing), C4389 (signed/ unsigned mismatch), etc. Warning count drops from ~2,822 to 119. The remaining 119 are genuine code-quality issues (99 narrowing conversions, 8 uninitialized vars, 4 C-linkage UDT) that should be fixed incrementally as files are touched. Tests still get full /W4 with zero warnings (already enforced). Signed-off-by: will wade <willwade@gmail.com>
Test files were formatted with clang-format 19 locally; CI uses 18.1.8 which produces slightly different output. Reformatted all 15 affected test files with clang-format-18 to match CI exactly. Signed-off-by: will wade <willwade@gmail.com>
….1) Rewrites the signal/slot event system to fix two long-standing issues: 1. Non-deterministic broadcast order: The original unordered_map<void*, fn> iterated in hash-dependent order, which was randomized per-process on Linux (libstdc++). This could cause subtle ordering bugs if multiple subscribers' side effects compound. Switched to std::map for sorted (deterministic) iteration. 2. Manual unsubscribe hazard: The original system required every subscriber to call Unsubscribe(this) before destruction — forgetting caused a use-after-free on the next Broadcast(). Added an RAII Subscription guard that auto-unsubscribes when it goes out of scope. New code can use the keyless Subscribe(fn) overload that returns a Subscription to store as a member. The void* Subscribe(listener, fn) / Unsubscribe(listener) API is preserved exactly — zero changes to the 18 Subscribe and 22 Unsubscribe call sites. Existing callers continue to work unchanged. The keyless Subscribe(fn) overload is opt-in for new code. Documentation: thorough block comments explaining the signal/slot pattern, both usage patterns (legacy void* and RAII), and the design rationale. Verified: 31/31 tests pass. Signed-off-by: will wade <willwade@gmail.com>
The review recommended splitting into "one file per action class" (16 files). Many classes are 3-5 lines each — creating 15+ tiny files would hurt navigation, not help it. Instead, added a class hierarchy ASCII diagram to the file-level comment showing the full architecture: ActionRegistry → ControlAction subclasses → NodeTemplate → CContNode → CControlManager. Also fixed misleading [[maybe_unused]] on SocketOutputAction and ATSPIAction member fields — these classes ARE instantiated (from alphabet XML in AlphIO.cpp), just their execute() methods are no-ops. Added explanatory comments documenting this. No structural change to the code — purely documentation improvement. Verified: control_action tests pass. Signed-off-by: will wade <willwade@gmail.com>
Signed-off-by: will wade <willwade@gmail.com>
…5) Adds Rule 7: Naming Convention to CONTRIBUTING.md. The convention applies to NEW code only — existing CFoo/m_iFoo legacy naming is explicitly preserved (no bulk renames). Convention for new code: - Classes: PascalCase (not CFoo) - Methods: PascalCase (not snake_case) - Members: snake_case_ trailing underscore (not m_iFoo) - Free functions: snake_case - Files: PascalCase.h / PascalCase.cpp Also documents the existing conventions new contributors will encounter (CFoo, m_iFoo, Hungarian notation) so they understand the codebase's history without being confused by the mix. This formalizes the review's recommendation #20 as a contribution rule. Signed-off-by: will wade <willwade@gmail.com>
Two fixes: 1. Mixture LM normalization leak: CMixtureLanguageModel::GetProbs was missing Probs[0]=0 (sentinel) and had no rounding correction. The blend of two sub-LMs lost ~519 units to integer truncation, leaving ~0.8% of the Dasher canvas as dead zones. Fixed by: - Setting Probs[0] = 0 (same contract as every other LM) - Adding rounding correction that distributes the residual Mixture total_mass: 65017 → 65536 (exact) 2. Log callback never invoked: dasher_set_log_callback stored the callback in ctx->logCb but no engine code called it. Now the Interface::Message() override routes engine messages through BOTH the message callback (for UI display) AND the log callback (for diagnostic logging). Modal/interrupt messages = WARN level, async messages = INFO level. Frontends that registered dasher_set_log_callback will now receive engine messages. Verified: 7/7 LM-sensitive tests pass (property_invariant, lm_correctness, ppm_golden, ppm_serialization, capi, capi_buffer_lifetime, benchmark). Signed-off-by: will wade <willwade@gmail.com>
The std::map change (commit db5f6ec) caused stuttering while zooming on Windows. MSVC's std::map allocates each node separately (red-black tree), creating cache misses during Broadcast. With typical subscriber counts (5-10), unordered_map's contiguous buckets are significantly faster for both iteration and lookup. Reverted the storage to unordered_map. Preserved the RAII Subscription improvement (the keyless Subscribe overload and auto-unsubscribe guard) — those are orthogonal to the storage container. Confirmed via git bisect on Dasher-Windows: Event.h was the commit that introduced the stuttering. Updated design notes to document why we use unordered_map, so the next person who thinks "let's make this deterministic" sees the warning. 31/31 tests pass. Signed-off-by: will wade <willwade@gmail.com>
PR #28 added v5 alphabet format support but introduced an emergent interaction bug where the 6 changes combined caused the paragraphSpace group (containing ¶ and □/space) to disappear from parsed alphabets. Users could not type spaces — text came out as "hellomynameis". Confirmed: pre-PR #28 code yields 62 symbols (space present); post-PR #28 code yields 60 symbols (space missing). The Windows team isolated the bug: each individual change works, but all 6 combined break it. The prime suspect is the Parse/ParseSingle dispatch split, which somehow changes pugixml node iteration behavior. Reverted AlphIO.cpp and AlphIO.h to pre-PR #28 state. v5 alphabet support will be re-implemented properly in a follow-up that doesn't break v6 parsing. Credit: Dasher-Windows team (bisect + revert branch). Verified: 62 symbols restored, 31/31 tests pass. Signed-off-by: will wade <willwade@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.