Skip to content

draft autosolve#183

Open
redrodeo03 wants to merge 21 commits into
mainfrom
autosolve
Open

draft autosolve#183
redrodeo03 wants to merge 21 commits into
mainfrom
autosolve

Conversation

@redrodeo03
Copy link
Copy Markdown
Collaborator

@redrodeo03 redrodeo03 commented Mar 27, 2026

Description

Fix #(issue)

image

we still need
cache.py
parse.py
rec_prithvi.py
prithvi.py
llm.py
from src/

Type of change

Please check the option that is related to your PR.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • In this case, we recommend to discuss your modification on GitHub issues before creating the PR
  • Documentations (modification for documents)

Checklist

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • New unit tests pass locally with my changes
  • I have checked my code and corrected any misspellings

Comment thread deepretro/utils/autosolve.py Outdated
@@ -0,0 +1,204 @@
"""Batch and single-molecule retrosynthesis runners.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to algorithms

Comment thread deepretro/utils/autosolve.py Outdated
smiles: str,
*,
llm: str = "anthropic/claude-3-7-sonnet-20250219:adv",
az_model: str = "Pistachio_100+",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate PR we need aizynthfinder

Comment thread deepretro/utils/autosolve.py Outdated

Returns
-------
dict[str, Any]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More details on output format please

Comment thread deepretro/utils/autosolve.py Outdated
from src.utils.job_context import logger as context_logger
from src.utils.parse import format_output

def _run(smiles, **kwargs):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't do this...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it a public function; avoid "private" functions

Comment thread deepretro/utils/autosolve.py Outdated
)

stability_flag = str(stability_check)
semaphore = asyncio.Semaphore(max_concurrent)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to document this thoroughly

Comment thread deepretro/utils/autosolve.py Outdated
hallucination_mode, hallucination_classifier,
)

stability_flag = str(stability_check)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this? Document

Comment thread deepretro/utils/autosolve.py Outdated
semaphore = asyncio.Semaphore(max_concurrent)
loop = asyncio.get_running_loop()

async def _process(smi: str) -> dict[str, Any]:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstrings

Comment thread deepretro/algorithms/autosolve.py Outdated
def get_pipeline():
"""Lazy-import the retrosynthesis pipeline and return a runner callable.

Imports are deferred so that ``import deepretro.algorithms.autosolve``
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how heavy is the aizynthfinder import?

also i'm not sure if the LLM libraries are heavy, they cause issues if they are not installed

Copy link
Copy Markdown
Collaborator

@shreyasvinaya shreyasvinaya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some simple fixes like usage examples, type annotations have to be added, rst docs to be added

some feedback on autosolve

Comment thread deepretro/algorithms/autosolve.py Outdated
from deepretro.migration.job_context import logger as context_logger
from deepretro.migration.parse import format_output

def run_retrosynthesis(smiles, **kwargs):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usage example, type annotations

Comment thread deepretro/algorithms/autosolve.py Outdated
def autosolve(
smiles: str,
*,
llm: str = "anthropic/claude-3-7-sonnet-20250219:adv",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switch this to opus 4.6 as default

Comment thread deepretro/algorithms/autosolve.py Outdated
az_model: str = "Pistachio_100+",
stability_check: bool = True,
hallucination_mode: str = "heuristic",
hallucination_classifier: Any = None,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid using any for typechecks

Comment thread deepretro/algorithms/autosolve.py Outdated
)


async def autosolve_async(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need a async autosolve?
@rbharath do we want to handle this at a function level or do we want to do this at a script level?
i'm not sure how well async functions work on jupyter notebooks

Comment thread deepretro/algorithms/autosolve.py Outdated
async def autosolve_async(
smiles_list: list[str],
*,
llm: str = "anthropic/claude-3-7-sonnet-20250219:adv",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switch default model to opus 4.6

Comment thread deepretro/utils/az.py
@@ -6,22 +6,34 @@
"""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can skip changes to this file, it is being fixed in #193

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noted

Comment thread deepretro/algorithms/autosolve.py Outdated
Examples
--------
>>> from deepretro.algorithms.autosolve import autosolve # doctest: +SKIP
>>> result = autosolve("c1ccccc1") # doctest: +SKIP
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you might want to move the different ways of calling autosove to rst docs

Comment thread deepretro/algorithms/autosolve.py Outdated
... hallucination_classifier="model_out/",
... ) # doctest: +SKIP
"""
run_retrosynthesis = get_pipeline()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to initialize the pipeline and then call it like a class?
can it not be called directly?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Copy link
Copy Markdown
Collaborator

@ARY2260 ARY2260 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left few comments, concerned about type mixing

4. Recurse on each reactant until all leaves are purchasable.
5. Flatten the tree into a step-by-step synthesis plan.

Examples
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please keep examples is main autosolve function docstring itself

Comment thread deepretro/algorithms/autosolve.py Outdated
``{"steps": [...], "dependencies": {...}}``.
When *return_image* is ``True``, also contains ``"image"``.
"""
import os
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please keep imports outside the function in general

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

Comment thread deepretro/algorithms/autosolve.py Outdated

if isinstance(classifier, (str, Path)):
clf = HallucinationClassifier()
clf.load(str(classifier))
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is not a deepchem style model with restore to reload the model?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do use restore, there is a thin wrapper on top to accommodate for threshold which we calculate separately

Comment thread deepretro/algorithms/autosolve.py Outdated
from deepretro.utils.parse import format_output

hallucination_check, hallucination_checker_fn = resolve_hallucination(
hallucination_mode, hallucination_classifier,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hallucination_classifier type is quite mixed, I think its a path as well as callable? Why is this variation needed?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, resolve_hallucination now returns a single callable (or None) instead of a (str, callable) tuple. The hallucination_classifier parameter accepts either a path to a saved model or a pre-loaded instance; resolve_hallucination normalises both into one ready-to-use checker function. The type variation is handled internally, not leaked to the caller.

Comment thread deepretro/algorithms/autosolve.py Outdated
depth: int = 0,
max_depth: int = 50,
) -> tuple[dict, bool]:
"""Recursively solve a molecule via AZ, falling back to LLM.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a detailed doctring explaining how the rec run works and add example.

Comment thread deepretro/algorithms/autosolve.py Outdated
Returns a nested mol/reaction tree and a *solved* flag.
"""
from rdkit import Chem
from deepretro.algorithms.llm import llm_pipeline
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please avoid internal imports, dependencies issues might creep in.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understood, added

return result_dict, solved


def unsolved_leaf(smiles: str) -> dict:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there are a more of such retrosynthesis tree utils, please add them in utils file or something

Comment thread deepretro/algorithms/autosolve.py Outdated
if isinstance(pathway, list):
all_solved = True
for smi in pathway:
res, stat = rec_run(molecule=smi, **recurse_kw)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is logic type mixing for molecule ie, smi or pathway, why this mixing is required?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there was a case in which either a list (multiple reactants) or a single element was passed, hence the mixing. We now perform a normalising step

Comment thread deepretro/algorithms/autosolve.py Outdated


def rec_run(
molecule: str,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an AutoSolver class would be a better design to avoid parameter shuffle.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Copy link
Copy Markdown
Collaborator

@ARY2260 ARY2260 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactor to AutoSolver class would be better.

Comment thread deepretro/algorithms/autosolve.py Outdated


class AutoSolver:
"""Holds shared config"""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expand docs significantly. This needs to be the core API object not just a config holder

Comment thread deepretro/algorithms/autosolve.py Outdated
"""Convert user-facing mode to a single checker callable (or None).

Returns None (skip checking), or a callable with signature
``(product: str, pathways: list) -> (int, list)``.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document arguments and add usage examples for every function

@@ -0,0 +1,261 @@
"""Synthesis pathway visualization.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already pushed to a separate PR #195

}


def mask_protecting_groups_multisymbol(smiles: str) -> str:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shreyasvinaya can you please take this utility function in your PR and add a parameter for path of config if path is accessible to the module that will be using this function at runtime.

path can be none by default, which leads to a warning and then use the default dict PG_MAP

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

protecting group will be ported over after the main llm.py is merged, it needs a thorough refactor, will include this in that refactor


import joblib

from deepretro import logging as job_context
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

part of new upcoming PR along with parse.py

import json
from pathlib import Path
from typing import Any, Dict

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@redrodeo03 Please check if this function is needed any longer after refactor. If not, delete it from the PR.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shreyasvinaya i dont think we've used langfuse in a while, are we ok to remove this?

Comment thread deepretro/utils/az.py
from aizynthfinder.aizynthfinder import AiZynthFinder

finder = AiZynthFinder(configfile=config_filename)
finder.stock.select("zinc")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put a separate small PR, and update the rst docs.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is already a part of #198

from typing import Any


def build_ml_checker(clf: Any) -> Callable:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Next steps:

  1. please remove hallucination_helper from models folder.
  2. in algorithm create a folder named hallucination checks, that will have the base check class with a core method "check_hallucinations(react, product)", and subs classes from this base class for heuristicHallucinationChecker() and XBBoostHallucinationChecker()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants