This project provides an agentic workflow for the yt volumetric data analysis library. It allows users to perform complex data analysis tasks using natural language queries, guided by a curated knowledge base of yt best practices.
## Installation

1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd yt-agent-project
   ```

2. Install dependencies: This project uses `pyproject.toml`. You can install it in editable mode:

   ```bash
   pip install -e .
   ```

   Note: You will need the `google-genai` and `yt` packages.

3. Set up your API key: The agent uses Google's Gemini models. You need to set your API key as an environment variable:

   ```bash
   export GOOGLE_API_KEY="your_api_key_here"
   ```
## Usage

Start a chat session with the agent:

```bash
python main.py --interactive
```

Run a specific query and immediately execute the generated code:

```bash
python main.py "Load snapshot_001.h5 and print the field list" --execute
```

## Extending the Knowledge Base

The agent's "brain" is a collection of Markdown files located in `yt_agent/knowledge_base/`. You can expand its capabilities by adding new topics, examples, and documentation.
### Interactive Training

The easiest way to teach the agent a new concept is to use the interactive training mode. The agent will interview you about a topic and generate the documentation file for you.

```bash
python main.py --train
```

You will be prompted for:

- A filename (e.g., `phase_plots`)
- A title for the topic
- A description of the concept
- A code example
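From these four answers the agent can assemble a standard knowledge base entry. The sketch below shows one way such a file could be written; `write_entry` is a hypothetical helper for illustration, not the agent's actual implementation:

```python
import tempfile
from pathlib import Path

def write_entry(directory, filename, title, description, code_example):
    # Hypothetical helper -- the real --train implementation may differ.
    # Mirrors the "# Title / prose / fenced code" entry layout.
    body = f"# {title}\n\n{description}\n\n```python\n{code_example}\n```\n"
    path = Path(directory) / f"{filename}.md"
    path.write_text(body)
    return path

entry = write_entry(tempfile.mkdtemp(), "phase_plots", "Phase Plots",
                    "How to build a phase plot of density vs. temperature.",
                    "import yt\nds = yt.load('snapshot_001.h5')")
```

In practice the agent writes the file into `yt_agent/knowledge_base/` so it is picked up on the next run.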
### Ingesting Notebooks

If you have existing yt analysis notebooks (`.ipynb`), you can automatically convert them into knowledge base entries. The tool extracts Markdown cells and code cells to preserve the context and logic.
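The cell extraction described above can be sketched in a few lines, since `.ipynb` files are plain JSON. This is a hypothetical illustration of the conversion, not the actual `--ingest` implementation:

```python
import json
from pathlib import Path

def notebook_to_markdown(ipynb_path):
    # Hypothetical sketch: Markdown cells pass through as prose,
    # code cells become fenced blocks, keeping context and logic
    # side by side. The real ingestion tool may differ.
    nb = json.loads(Path(ipynb_path).read_text())
    parts = []
    for cell in nb["cells"]:
        source = "".join(cell["source"])
        if cell["cell_type"] == "markdown":
            parts.append(source)
        elif cell["cell_type"] == "code":
            parts.append(f"```python\n{source}\n```")
    return "\n\n".join(parts)
```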
```bash
python main.py --ingest path/to/notebook.ipynb
```

You can also ingest multiple notebooks at once:

```bash
python main.py --ingest notebook1.ipynb notebook2.ipynb
```

### Manual Entries

You can also simply create new Markdown (`.md`) files in `yt_agent/knowledge_base/`. Ensure they follow this structure for best results:
````markdown
# Topic Title

Explanation of the concept...

```python
import yt
# Best practice code example
```
````

## Context Caching

As your knowledge base grows, sending all documentation with every query can become expensive. This agent automatically uses Gemini context caching to minimize costs.
- On startup, the agent checks if the current knowledge base matches an existing cache in your Google Cloud project.
- **If a match is found:** the cache is reused. You do not pay for ingestion again; you only pay for the query tokens.
- **If no match is found:** the knowledge base is uploaded and a new cache is created (valid for 2 hours).
This means repeated runs or interactive sessions share the same "brain" without re-uploading data, making it efficient for analyzing multiple datasets in sequence.
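One plausible way to implement the startup check described above is to fingerprint the knowledge base and compare that fingerprint against existing caches. The sketch below shows only the hashing step; `kb_fingerprint` is a hypothetical helper, not the agent's actual code:

```python
import hashlib
from pathlib import Path

def kb_fingerprint(kb_dir):
    # Deterministic hash over all knowledge-base files, in sorted order
    # so the result is stable across runs. Hypothetical sketch; the
    # agent's actual cache-matching logic may differ.
    digest = hashlib.sha256()
    for path in sorted(Path(kb_dir).glob("*.md")):
        digest.update(path.name.encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()

# The agent could store this fingerprint in the cache's display name and,
# on startup, reuse any existing cache whose name matches; otherwise it
# would upload the files and create a fresh cache with a 2-hour TTL.
```

Any edit to a knowledge base file changes the fingerprint, which is what forces a new cache to be created after you add or modify entries.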