Skip to content

data-exp-lab/yt-agent

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

yt Agent Project

This project provides an agentic workflow for the yt volumetric data analysis library. It allows users to perform complex data analysis tasks using natural language queries, guided by a curated knowledge base of yt best practices.

Installation

  1. Clone the repository:

    git clone <repository-url>
    cd yt-agent-project
  2. Install dependencies: This project uses pyproject.toml. You can install it in editable mode:

    pip install -e .

    Note: You will need the google-genai and yt packages.

  3. Set up your API Key: The agent uses Google's Gemini models. You need to set your API key as an environment variable:

    export GOOGLE_API_KEY="your_api_key_here"

Usage

Interactive Mode

Start a chat session with the agent:

python main.py --interactive

Single Query Execution

Run a specific query and immediately execute the generated code:

python main.py "Load snapshot_001.h5 and print the field list" --execute

Training & Building the Knowledge Base

The agent's "brain" is a collection of Markdown files located in yt_agent/knowledge_base/. You can expand its capabilities by adding new topics, examples, and documentation.

1. Interactive Training

The easiest way to teach the agent a new concept is to use the interactive training mode. The agent will interview you about a topic and generate the documentation file for you.

python main.py --train

You will be prompted for:

  • A filename (e.g., phase_plots)
  • A title for the topic
  • A description of the concept
  • A code example

2. Ingesting Jupyter Notebooks

If you have existing yt analysis notebooks (.ipynb), you can automatically convert them into knowledge base entries. The tool extracts Markdown cells and Code cells to preserve the context and logic.

python main.py --ingest path/to/notebook.ipynb

You can also ingest multiple notebooks at once:

python main.py --ingest notebook1.ipynb notebook2.ipynb

3. Manual Addition

You can simply create new Markdown (.md) files in yt_agent/knowledge_base/. Ensure they follow this structure for best results:

# Topic Title

Explanation of the concept...

\`\`\`python
import yt

# Best practice code example

\`\`\`

Cost Optimization (Context Caching)

As your knowledge base grows, sending all documentation with every query can become expensive. This agent automatically uses Gemini Context Caching to minimize costs.

  1. On startup, the agent checks if the current knowledge base matches an existing cache in your Google Cloud project.
  2. If a match is found: It reuses the cache. You do not pay for ingestion again. You only pay for the query tokens.
  3. If no match is found: It uploads the knowledge base and creates a new cache (valid for 2 hours).

This means repeated runs or interactive sessions share the same "brain" without re-uploading data, making it efficient for analyzing multiple datasets in sequence.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages