Disable MiniSat simplifier after first incremental solve #8851

Draft
tautschnig wants to merge 8 commits into diffblue:develop from tautschnig:bugfixes/no-incremental-simp
Conversation

@tautschnig
Collaborator

MiniSat's SimpSolver runs variable elimination (eliminate()) before every solveLimited/solve call by default. When used incrementally in the all-properties verification loop, this repeated elimination can degrade the solver's ability to prove UNSAT, causing the CDCL search to hang indefinitely.

Fix by passing do_simp=false on all calls after the first. The first call still benefits from the full simplification pass, but subsequent incremental calls skip it, preventing the problematic variable elimination between iterations.

This fixes a hang when running:
cbmc --arrays-uf-always --unwind 1 --32 Array_UF23/main.c
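The calling pattern described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `mock_simp_solvert` is a hypothetical stand-in for `Minisat::SimpSolver` that only records the `do_simp` argument it receives, and `incremental_wrappert` is an invented name for the wrapper that tracks whether the solver has already been called.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for Minisat::SimpSolver. It does no solving and
// only logs the do_simp argument so the calling pattern can be observed.
struct mock_simp_solvert
{
  std::vector<bool> do_simp_log;

  // Same shape as the real default: simplification is on unless disabled.
  bool solve(bool do_simp = true)
  {
    do_simp_log.push_back(do_simp);
    return true;
  }
};

// Hypothetical wrapper: simplify on the first call, skip it afterwards.
class incremental_wrappert
{
public:
  bool solve()
  {
    const bool do_simp = !solver_was_called;
    solver_was_called = true;
    return solver.solve(do_simp);
  }

  mock_simp_solvert solver;

private:
  bool solver_was_called = false;
};
```

Calling `solve()` three times on one `incremental_wrappert` passes `do_simp=true` once and `do_simp=false` twice, which is the behaviour the description asks for.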

  • Each commit message has a non-empty body, explaining why the change was made.
  • n/a Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • n/a The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • n/a My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

Copilot AI left a comment

Pull request overview

This PR fixes a hang in CBMC's incremental SAT solving when using MiniSat's SimpSolver. The issue occurs because MiniSat's simplifier runs variable elimination (eliminate()) before every solveLimited/solve call, which can degrade the solver's ability to prove UNSAT during the all-properties verification loop. The fix passes do_simp=false on all calls after the first, allowing the initial full simplification pass but skipping it on subsequent incremental calls.

Changes:

  • Added a solver_was_called flag to satcheck_minisat2_baset to track whether the solver has been called before, and used if constexpr to pass do_simp=false to SimpSolver::solveLimited/solve on subsequent calls
  • Reordered includes to follow the .clang-format priority ordering and added <type_traits> for std::is_same_v
  • Added a regression test (Array_UF23) that reproduces the hang scenario with --arrays-uf-always --unwind 1 --32
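The compile-time dispatch mentioned in the first bullet could look roughly like the sketch below (C++17). Both solver types are hypothetical mocks: `mock_core_solvert` stands for a solver whose `solve` has no `do_simp` parameter, and `mock_simp_solvert` for one that has it. The class name `satcheck_sketcht` is invented for this illustration and is not the PR's `satcheck_minisat2_baset`.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical mock of a core solver: solve() takes no do_simp argument.
struct mock_core_solvert
{
  int calls = 0;
  bool solve()
  {
    ++calls;
    return true;
  }
};

// Hypothetical mock of a simplifying solver: solve() accepts do_simp.
struct mock_simp_solvert
{
  std::vector<bool> do_simp_log;
  bool solve(bool do_simp = true)
  {
    do_simp_log.push_back(do_simp);
    return true;
  }
};

template <typename T>
class satcheck_sketcht
{
public:
  bool solve()
  {
    const bool first_call = !solver_was_called;
    solver_was_called = true;

    // Only the simplifying solver has a do_simp parameter, so the branch
    // is selected at compile time and the other one is never instantiated.
    if constexpr(std::is_same_v<T, mock_simp_solvert>)
      return solver.solve(first_call);
    else
      return solver.solve();
  }

  T solver;

private:
  bool solver_was_called = false;
};
```

Using `if constexpr` rather than a runtime `if` matters here because `solver.solve(first_call)` would not compile for a solver type without that parameter.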

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/solvers/sat/satcheck_minisat2.h | Added solver_was_called boolean member with in-class default initializer |
| src/solvers/sat/satcheck_minisat2.cpp | Used if constexpr to conditionally disable SimpSolver's simplifier after first solve, reordered includes |
| regression/cbmc/Array_UF23/main.c | New regression test C source with VLA array stores and an assertion |
| regression/cbmc/Array_UF23/test.desc | Test descriptor for the regression test |

@codecov

codecov bot commented Mar 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.01%. Comparing base (20a9052) to head (c8ca115).
⚠️ Report is 1 commit behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8851      +/-   ##
===========================================
- Coverage    80.01%   80.01%   -0.01%     
===========================================
  Files         1700     1700              
  Lines       188338   188347       +9     
  Branches        73       73              
===========================================
+ Hits        150702   150707       +5     
- Misses       37636    37640       +4     

@kroening
Collaborator

kroening commented Mar 5, 2026

Is this a genuine issue in MiniSat, or are we using it wrong?

@tautschnig
Collaborator Author

It's a known limitation of MiniSat's SimpSolver when used incrementally. The SimpSolver was designed primarily for single-shot solving — its solve/solveLimited methods run eliminate() before every call by default. While frozen variables are protected from elimination, the variable elimination and subsumption passes restructure the clause database between incremental calls, which can make subsequent UNSAT proofs much harder for the CDCL search.

MiniSat's API does provide the do_simp parameter precisely for this use case (the parameter defaults to true). We just weren't using it. Other incremental SAT solver interfaces (CaDiCaL, etc.) don't have this issue because they don't run preprocessing between incremental calls unless explicitly asked to.

@tautschnig tautschnig force-pushed the bugfixes/no-incremental-simp branch 3 times, most recently from 75c8bd0 to 59fef53 Compare March 8, 2026 19:48
@tautschnig tautschnig marked this pull request as draft March 9, 2026 09:44
tautschnig and others added 8 commits March 9, 2026 14:17
MiniSat's SimpSolver runs variable elimination (eliminate()) before
every solveLimited/solve call by default. When used incrementally in
the all-properties verification loop, this repeated elimination can
degrade the solver's ability to prove UNSAT, causing the CDCL search
to hang indefinitely.

Fix by passing do_simp=false on all calls after the first. The first
call still benefits from the full simplification pass, but subsequent
incremental calls skip it, preventing the problematic variable
elimination between iterations.

This fixes a hang when running:
  cbmc --arrays-uf-always --unwind 1 --32 Array_UF23/main.c

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

Add TIMEOUT 1200 to the set_tests_properties call in both copies of the
add_test_pl_profile macro (regression/CMakeLists.txt and
regression/libcprover-cpp/CMakeLists.txt). This ensures CTest will kill
and report as failed any regression test that exceeds 20 minutes,
preventing CI jobs from hanging indefinitely on tests that time out.

Co-authored-by: Kiro (autonomous agent) <kiro-agent@users.noreply.github.com>

Fork the test command into a new process group and use alarm() to enforce
an optional per-test timeout. When the timeout fires, kill the entire
process group with SIGKILL and report the test as failed with a TIMEOUT
marker.

The timeout can be set via -t <secs> on the command line or via the
TESTPL_TIMEOUT environment variable (command line takes precedence).

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
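
The mechanism in the commit message above (fork into a new process group, arm alarm(), SIGKILL the whole group on expiry) can be sketched with the POSIX API. test.pl itself is a Perl script, so this C++ version only illustrates the technique under that assumption; `run_with_timeout` and `run_resultt` are hypothetical names, and the -t option and TESTPL_TIMEOUT handling are not modelled.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

namespace
{
volatile sig_atomic_t alarm_fired = 0;
volatile pid_t child_group = 0;

// SIGALRM handler: kill every process in the child's process group.
// kill() is async-signal-safe, so calling it here is permitted.
void on_alarm(int)
{
  alarm_fired = 1;
  if(child_group > 0)
    kill(-child_group, SIGKILL);
}
} // namespace

struct run_resultt
{
  bool timed_out;
  int exit_code; // -1 unless the child exited normally
};

// Runs argv[0] with arguments argv; timeout_secs == 0 means no timeout.
run_resultt run_with_timeout(char *const argv[], unsigned timeout_secs)
{
  assert(argv != nullptr && argv[0] != nullptr);
  alarm_fired = 0;

  struct sigaction sa = {};
  sa.sa_handler = on_alarm;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART; // let waitpid() resume after the handler runs
  sigaction(SIGALRM, &sa, nullptr);

  const pid_t pid = fork();
  if(pid < 0)
    return {false, -1};

  if(pid == 0)
  {
    setpgid(0, 0); // child becomes leader of a new process group
    execvp(argv[0], argv);
    _exit(127); // only reached if exec failed
  }

  setpgid(pid, pid); // also set it in the parent to avoid a startup race
  child_group = pid;
  if(timeout_secs > 0)
    alarm(timeout_secs);

  int status = 0;
  while(waitpid(pid, &status, 0) < 0 && errno == EINTR)
  {
  }

  alarm(0); // cancel a pending alarm if the child finished in time
  child_group = 0;

  if(alarm_fired)
    return {true, -1};
  return {false, WIFEXITED(status) ? WEXITSTATUS(status) : -1};
}
```

Signalling the negative process-group ID is what lets the timeout reach grandchildren such as a solver spawned by the test command, which is the case the later CI commit describes as CTest possibly leaving orphaned processes behind.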
These two test suites regularly exceed the default 1200-second timeout
on macOS CI runners. Set a 2700-second (45-minute) timeout for
jbmc-symex-driven-lazy-loading-CORE and
jbmc-strings-symex-driven-lazy-loading-CORE.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

contracts-dfcc-CORE and book-examples-cprover-smt2-CORE exceeded the
default 1200-second timeout on macOS 14 ARM runners. Set 3600-second
(1-hour) timeouts for these suites. Also bump jbmc symex-driven
lazy-loading timeouts from 2700 to 3600 seconds as they still timed
out at the previous limit.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

Temporarily disable all CI jobs except check-ubuntu-22_04-cmake-gcc-32bit
to focus debugging on the timeout issue. Changes to the 32-bit job:

- Add timeout-minutes: 45 at the GitHub Actions job level to prevent
  the 6-hour default timeout from wasting CI resources.
- Set TESTPL_TIMEOUT=600 environment variable so test.pl kills any
  individual test that runs longer than 10 minutes.
- Add --timeout 1200 to the ctest command line as a belt-and-suspenders
  measure alongside the CMakeLists.txt TIMEOUT property.
- Add --output-on-failure to ctest for better diagnostic output.

The CTest TIMEOUT 1200 property (from CMakeLists.txt) was already present
in the last CI run but did not prevent the 6-hour hang, suggesting CTest
may not be killing orphaned child processes when it terminates test.pl.
The TESTPL_TIMEOUT mechanism should be more reliable as test.pl kills the
entire process group (kill -9 -$pid).

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

The previous unconditional disabling of the simplifier after the first
incremental solve caused Multi_Dimensional_Array6 to hang in the 32-bit
CI build. The simplifier is beneficial for most workloads; only the
Ackermann-style array encoding (--arrays-uf-always) generates the
problematic pattern of many incremental calls where variable elimination
between solves degrades CDCL performance.

Make the behaviour opt-in via set_limit_incremental_simplification() on
the MiniSat solver, and activate it from solver_factory when arrays-uf
is set to "always".

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
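
A rough sketch of the opt-in design from the commit message above: simplification is skipped on later calls only when the limit has been switched on, and a factory switches it on when arrays-uf is "always". The setter name comes from the commit message; its signature, the mock solver, the class name `satcheck_sketcht`, and the `make_solver` function are all assumptions made for illustration and do not reproduce CBMC's solver_factory.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for Minisat::SimpSolver that logs do_simp.
struct mock_simp_solvert
{
  std::vector<bool> do_simp_log;
  bool solve(bool do_simp = true)
  {
    do_simp_log.push_back(do_simp);
    return true;
  }
};

class satcheck_sketcht
{
public:
  // Name from the commit message; parameter-less signature is assumed.
  void set_limit_incremental_simplification()
  {
    limit_incremental_simplification = true;
  }

  bool solve()
  {
    // Without the opt-in, every call simplifies (the solver's default).
    const bool do_simp =
      !(limit_incremental_simplification && solver_was_called);
    solver_was_called = true;
    return solver.solve(do_simp);
  }

  mock_simp_solvert solver;

private:
  bool limit_incremental_simplification = false;
  bool solver_was_called = false;
};

// Hypothetical factory: opt in only for the "always" array encoding.
satcheck_sketcht make_solver(const std::string &arrays_uf)
{
  satcheck_sketcht s;
  if(arrays_uf == "always")
    s.set_limit_incremental_simplification();
  return s;
}
```

With any other arrays-uf value the solver keeps simplifying before every call, which is the behaviour that Multi_Dimensional_Array6 relied on according to the commit message.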
Re-enable all CI jobs and remove the temporary timeout diagnostics
now that the root cause has been identified and fixed.

The 32-bit CI hang was caused by Multi_Dimensional_Array6/test.desc
timing out because the MiniSat simplifier was unconditionally disabled
after the first incremental solve. The fix in the previous commit
scopes this behaviour to --arrays-uf-always only.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
@tautschnig tautschnig force-pushed the bugfixes/no-incremental-simp branch from e60ecaa to c8ca115 Compare March 9, 2026 14:17

3 participants