hallucination classifier and helper fixes by Darkxzie · Pull Request #211 · deepforestsci/DeepRetro

Darkxzie · 2026-04-27T16:35:37Z

Description

Merged the classifier dataset prediction path into predict_probability(dataset, threshold=None), which now returns (labels, probabilities).
Added optional threshold override support for the single-reaction prediction paths.
Normalized the ML hallucination helper output so that single-string pathways are returned as one-element lists.
Fix #(issue)
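
For readers skimming the description, the two behaviours it names (a threshold override that falls back to a default, and wrapping a single-string pathway in a list) can be sketched as below. This is only an illustration: `ToyClassifier`, `normalize_pathways`, the `>=` comparison, and the use of raw scores as the "dataset" are assumptions and are not taken from the DeepRetro code.

```python
from typing import List, Optional, Sequence, Tuple, Union


class ToyClassifier:
    """Hypothetical stand-in for the real classifier; scores are passed in directly."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold  # default decision threshold

    def predict_probability(
        self, dataset: Sequence[float], threshold: Optional[float] = None
    ) -> Tuple[List[int], List[float]]:
        """Return (labels, probabilities); an explicit threshold overrides the default."""
        cutoff = self.threshold if threshold is None else threshold
        probabilities = [float(p) for p in dataset]
        # Assumption: a score at or above the cutoff is labelled 1.
        labels = [1 if p >= cutoff else 0 for p in probabilities]
        return labels, probabilities


def normalize_pathways(pathways: Union[str, Sequence[str]]) -> List[str]:
    """Wrap a single-string pathway in a one-element list; copy anything else to a list."""
    if isinstance(pathways, str):
        return [pathways]
    return list(pathways)
```

Passing `threshold=None` keeps the stored default, so existing callers that never pass a threshold would see no change in labels under this sketch.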

Type of change

Please check the option that is related to your PR.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
- In this case, we recommend to discuss your modification on GitHub issues before creating the PR
Documentations (modification for documents)

Checklist

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
New unit tests pass locally with my changes
I have checked my code and corrected any misspellings

shreyasvinaya · 2026-04-27T16:43:29Z

    ) -> dict[str, Any]:
        """Thin wrapper around :func:`predict_single_reaction`."""
-        return predict_single_reaction(self, product_smiles, reactants_smiles)
+        return predict_single_reaction(


you might want to remove this function in its entirety

I had to add this because hallucination_helpers.py calls predict_single, and I have a few doubts about how to change the logic in that file.

Please resolve this.

riya-singh28

@Darkxzie I have left a few comments after my first review.

riya-singh28 · 2026-05-04T16:13:32Z

+    >>> checker = build_ml_checker(clf)                       # doctest: +SKIP
+    >>> status, kept = checker("CCO", [["CC", "O"]])          # doctest: +SKIP
+    """
+    def _checker(product: str, pathways: list) -> tuple[int, list]:


My suggestion is to change this to a simple MLChecker class for now. It resolves issues with the nested functions and provides a clean pathway for future sub-classes, in case multiple types of MLClassifiers are added.

    class MLChecker:
        __init__(clf):
            store classifier
            load is_valid_smiles utility
        __call__(product, pathways):
            ...
            if valid_pathways is not empty:
                return (200, valid_pathways)
            else:
                return (400, [])
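
A minimal Python sketch of the class outlined above might look like the following. The elided middle step is filled in purely for illustration: the `is_valid_smiles` placeholder, the `predict_single(product, reactants)` call shape, and the `"is_hallucination"` result key are all assumptions, not the project's actual API.

```python
from typing import Any, Callable, List, Sequence, Tuple, Union


def is_valid_smiles(smiles: str) -> bool:
    """Placeholder check (hypothetical); a real one would parse the SMILES string."""
    return isinstance(smiles, str) and bool(smiles.strip())


class MLChecker:
    """Callable that keeps only the pathways the classifier does not flag."""

    def __init__(self, clf: Any, validator: Callable[[str], bool] = is_valid_smiles) -> None:
        self.clf = clf                    # store classifier
        self.is_valid_smiles = validator  # validity utility, injectable for tests

    def __call__(
        self, product: str, pathways: Sequence[Union[str, Sequence[str]]]
    ) -> Tuple[int, List[List[str]]]:
        valid_pathways: List[List[str]] = []
        for pathway in pathways:
            # A bare string is treated as a one-element pathway.
            reactants = [pathway] if isinstance(pathway, str) else list(pathway)
            if not all(self.is_valid_smiles(s) for s in reactants):
                continue
            # Assumption: the classifier returns a dict with an "is_hallucination" flag.
            result = self.clf.predict_single(product, reactants)
            if not result.get("is_hallucination", False):
                valid_pathways.append(reactants)
        if valid_pathways:
            return 200, valid_pathways
        return 400, []
```

Because the instance is callable, it could be dropped in wherever the nested `_checker` function was returned, and a subclass could override `__call__` if other classifier types are added later.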

riya-singh28 · 2026-05-04T16:15:18Z

+                f"hallucination_mode='ml' requires a HallucinationClassifier "
+                f"or path to saved model — got {type(classifier)}"
+            )
+        return build_ml_checker(clf)


Please update this according to the suggested new class.

riya-singh28 · 2026-05-04T16:15:59Z

+    >>> checker = resolve_hallucination("heuristic", None)  # doctest: +SKIP
+    >>> callable(checker)                                    # doctest: +SKIP
+    True
+    >>> checker = resolve_hallucination("ml", "model_out/") # doctest: +SKIP


Fix doctest, remove skip, add import statements
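
As a rough illustration of this request, the sketch below shows a docstring whose examples run without `# doctest: +SKIP`, together with a small runner. The `resolve_hallucination` here is a hypothetical stand-in that only handles the heuristic mode, and `run_doctests` is a helper invented for this example; in the real module the docstring would also need the import line for the function under test.

```python
import doctest


def resolve_hallucination(mode, classifier=None):
    """Return a checker callable for ``mode`` (hypothetical stand-in).

    >>> checker = resolve_hallucination("heuristic", None)
    >>> callable(checker)
    True
    >>> checker("CCO", [["CC", "O"]])
    (200, [['CC', 'O']])
    """
    if mode == "heuristic":
        # Assumption: the heuristic checker accepts every pathway in this toy version.
        return lambda product, pathways: (200, list(pathways))
    raise ValueError(f"unsupported hallucination_mode: {mode!r}")


def run_doctests(func) -> doctest.TestResults:
    """Parse and run the examples in ``func``'s docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)
```

The ML example from the original docstring is left out of this sketch because it depends on a saved model directory, which a doctest cannot assume exists.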

riya-singh28

I have requested a few changes.

riya-singh28 · 2026-05-19T21:13:22Z

    ) -> dict[str, Any]:
        """Thin wrapper around :func:`predict_single_reaction`."""
-        return predict_single_reaction(self, product_smiles, reactants_smiles)
+        return predict_single_reaction(


Please resolve this.

riya-singh28 · 2026-05-19T21:22:23Z

+    assert np.all((probabilities >= 0.0) & (probabilities <= 1.0))
+
+
+def test_predict_probability_override_uses_explicit_threshold():


Add docstrings for all the unit tests

riya-singh28 · 2026-05-22T13:06:23Z

@Darkxzie Please add a test file for hallucination_helpers.py and include tests for the resolve_hallucination function with model types = heuristic, ml, and none, among other required tests.
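
One possible shape for such a test file is sketched below, with a docstring on each test as requested earlier in the review. Everything about `resolve_hallucination` here is an assumption made so the example runs on its own: the stand-in body, the `ValueError` type, and the idea that the none mode returns `None` are not confirmed by the PR.

```python
from typing import Callable, Optional


def resolve_hallucination(mode: Optional[str], classifier=None) -> Optional[Callable]:
    """Hypothetical stand-in for the function in hallucination_helpers.py."""
    if mode is None or mode == "none":
        return None  # assumption: no checker when hallucination checks are off
    if mode == "heuristic":
        return lambda product, pathways: (200, list(pathways))
    if mode == "ml":
        if classifier is None:
            raise ValueError("hallucination_mode='ml' requires a classifier")
        return lambda product, pathways: (200, list(pathways))
    raise ValueError(f"unknown hallucination_mode: {mode!r}")


def test_heuristic_mode_returns_callable():
    """The heuristic mode should yield a callable checker without a classifier."""
    assert callable(resolve_hallucination("heuristic", None))


def test_ml_mode_requires_classifier():
    """The ml mode should reject a missing classifier with a clear error."""
    try:
        resolve_hallucination("ml", None)
    except ValueError as exc:
        assert "requires" in str(exc)
    else:
        raise AssertionError("expected ValueError")


def test_none_mode_disables_checking():
    """The none mode should return no checker at all (assumed behaviour)."""
    assert resolve_hallucination("none", None) is None
```

In the real test file the stand-in would be replaced by an import from hallucination_helpers.py, and the ml case would pass a test double in place of a trained classifier.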

riya-singh28 · 2026-05-22T13:12:22Z

@Darkxzie Can you also check why these tests are failing? I have not seen it failing for other PRs yet.

FAILED test_adv_prompt.py::test_claude_adv_success - assert 400 == 200
FAILED test_llm.py::test_call_llm_success - assert 400 == 200

Darkxzie · 2026-05-22T18:11:38Z

> @Darkxzie Can you also check why these tests are failing? I have not seen it failing for other PRs yet.
> FAILED test_adv_prompt.py::test_claude_adv_success - assert 400 == 200
> FAILED test_llm.py::test_call_llm_success - assert 400 == 200

Hey Riya, I checked the failing run. This PR comes from a fork branch, so GitHub Actions runs without the Anthropic API key, which is why both test_claude_adv_success and test_call_llm_success fail. I opened it from a fork because I currently do not have permission to create branches in this repo.
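
One common way to keep live-API tests from failing on fork runs is to skip them when the key is absent. The sketch below uses only the standard library; the environment variable name `ANTHROPIC_API_KEY`, the `has_api_key` helper, and the placeholder test are assumptions for illustration and are not taken from this repository's test suite.

```python
import os
import unittest
from typing import Mapping, Optional

API_KEY_ENV = "ANTHROPIC_API_KEY"  # assumed name of the secret in CI


def has_api_key(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when a non-empty API key is present in the environment."""
    env = os.environ if env is None else env
    return bool(env.get(API_KEY_ENV, "").strip())


class LiveClaudeTests(unittest.TestCase):
    """Tests that would call the live API; skipped when no key is configured."""

    @unittest.skipUnless(has_api_key(), f"{API_KEY_ENV} is not set")
    def test_live_call(self) -> None:
        """Placeholder for a real request; only confirms the key is visible."""
        self.assertTrue(has_api_key())
```

With a guard like this, a fork run reports the live tests as skipped instead of failing with `assert 400 == 200`, while runs that have the secret still exercise them.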

Darkxzie added 2 commits April 27, 2026 21:27

fix hallucination classifier and helper APIs for main

d8955aa

drop tests from main PR branch

069b4a9

Darkxzie had a problem deploying to testing April 27, 2026 16:35 — with GitHub Actions Failure

restore classifier tests to main PR branch

af488c7

Darkxzie had a problem deploying to testing April 27, 2026 16:40 — with GitHub Actions Failure

shreyasvinaya reviewed Apr 27, 2026

Darkxzie changed the title ~~Autosolve changes main~~ hallucination classifier and helper fixes Apr 27, 2026

riya-singh28 requested changes May 4, 2026

Refactor hallucination ML checker helper

5eae275

Darkxzie had a problem deploying to testing May 15, 2026 16:39 — with GitHub Actions Failure

riya-singh28 requested changes May 19, 2026

Refine hallucination checker API

ee08827

Darkxzie had a problem deploying to testing May 20, 2026 09:17 — with GitHub Actions Failure

riya-singh28 reviewed May 22, 2026

add hallucination helper tests and guard live claude tests

5e1d011

Darkxzie deployed to testing May 22, 2026 17:54 — with GitHub Actions Active

revert live claude test guards

536cc8c

Darkxzie had a problem deploying to testing May 22, 2026 17:57 — with GitHub Actions Failure

clean up hallucination helper test doubles

e9a0e36

Darkxzie had a problem deploying to testing May 22, 2026 18:09 — with GitHub Actions Failure

3 participants