image captions using blip. #204

Open
gsaluja9 wants to merge 46 commits into main from image_captions

Contributor

gsaluja9 commented Sep 15, 2025

Adds auto generation of image captions using BLIP.
https://huggingface.co/docs/transformers/main/en/model_doc/blip#transformers.BlipForConditionalGeneration

TODO:

  • Add tests: adding a validation at build time with a basic script.
  • Add docs

gsaluja9 requested review from bovlb and drewaogle September 15, 2025 22:42
gsaluja9 marked this pull request as ready for review September 17, 2025 13:52
gsaluja9 requested a review from luisremis September 17, 2025 18:22
Comment thread .devcontainer/caption-image/devcontainer.json Outdated
Comment thread apps/caption-image/Dockerfile Outdated
Comment thread apps/caption-image/README.md Outdated
Comment thread apps/caption-image/app/images.py Outdated
Comment thread apps/caption-image/app/images.py Outdated
Comment thread apps/caption-image/app/images.py Outdated
Comment thread apps/caption-image/Dockerfile Outdated
Comment thread workflows-devcontiner.code-workspace Outdated
Comment thread .devcontainer/configuration_params.py
Contributor

ad-claw000 left a comment

This workflow looks like a great addition. LGTM!

Contributor

ad-claw000 left a comment

Great addition! However, I noticed a few issues that should be addressed before this is merged:

  1. Hardcoded batch_size: In apps/caption-image/app/images.py, self.batch_size = 32 is hardcoded inside FindImageQueryGenerator.__init__. It ignores the batch_size parameter passed from the CLI in caption_images.py. You should pass batch_size into the generator constructor and use it there.
  2. Pagination logic during updates: The query generator fetches batches using batch_id: idx while filtering on wf_caption_image == None. Because the response handler updates these entities and removes the None condition, the total number of matching images changes dynamically. Depending on how ApertureDB evaluates batch_id, this could lead to skipping images (e.g. batch 1 shifts into batch 0's place after batch 0 is updated). A safer approach might be to not rely on batch_id and instead repeatedly request the first N items (e.g., limit), or use a stable identifier for pagination.
  3. Module-level Model Loading: The AutoProcessor and BlipForConditionalGeneration are loaded at the module level in images.py. This means they are loaded into memory as soon as the module is imported, even if just running --help. Consider lazy-loading them inside the class or function, or at least only when the command is actually executed.

Please let me know when these are updated!
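
The skipping risk described in point 2 can be reproduced without a database. The following is a minimal sketch in plain Python, assuming the server re-evaluates the "caption is missing" filter on every request and then slices the result by batch index. The names `store`, `offset_pagination`, and `first_n_pagination` are hypothetical and only for illustration; they are not code from this PR or part of the ApertureDB API.

```python
def pending_ids(store):
    """Return ids whose caption is still missing, in a stable order."""
    return [image_id for image_id, caption in sorted(store.items()) if caption is None]


def offset_pagination(store, batch_size):
    """Caption images while paging by batch index over a shrinking result set."""
    processed = []
    batch_id = 0
    while True:
        # The filter is re-applied on every request, so captioned images drop out.
        pending = pending_ids(store)
        batch = pending[batch_id * batch_size:(batch_id + 1) * batch_size]
        if not batch:
            break
        for image_id in batch:
            store[image_id] = f"caption for {image_id}"
        processed.extend(batch)
        batch_id += 1
    return processed


def first_n_pagination(store, batch_size):
    """Caption images by always requesting the first batch_size pending items."""
    processed = []
    while True:
        batch = pending_ids(store)[:batch_size]
        if not batch:
            break
        for image_id in batch:
            store[image_id] = f"caption for {image_id}"
        processed.extend(batch)
    return processed


if __name__ == "__main__":
    print(offset_pagination({i: None for i in range(10)}, 3))
    print(first_n_pagination({i: None for i in range(10)}, 3))
```

With 10 images and a batch size of 3, the batch-index loop captions only 6 images (ids 3, 4, 5, and 9 are skipped), while the first-N loop captions all 10. The first-N loop terminates only if every fetched image leaves the pending set, so an image that fails to caption has to be marked in some way or the loop will keep fetching it.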

- Pass batch_size from CLI down to QueryGenerator
- Replace batch_id pagination with limit to handle dynamic properties
- Lazy-load AutoProcessor and Blip model to improve startup time
Copilot AI review requested due to automatic review settings May 24, 2026 18:14
Contributor

ad-claw000 commented May 24, 2026

I've pushed a commit to address the review comments:

  1. batch_size is now correctly propagated from the CLI to the query generator.
  2. Switched from batch_id pagination to using limit since wf_caption_image gets updated dynamically.
  3. The BLIP model and processor are now lazy-loaded on demand to speed up script initialization (e.g. when just running --help).

Let me know if this looks good to go!

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a new caption-image workflow that auto-generates image captions using the HuggingFace BLIP model, adds container/build plumbing for it, and adds devcontainer configurations to run workflows against a local ApertureDB stack.

Changes:

  • Added a new apps/caption-image workflow that finds uncaptioned images and writes captions back to ApertureDB.
  • Integrated the new app into CI builds and added a Docker build-time “warmup/validation” step.
  • Added devcontainer configs (compose + scripts) to spin up a local ApertureDB + Lenz + WebUI stack for multiple workflows.

Reviewed changes

Copilot reviewed 20 out of 21 changed files in this pull request and generated 11 comments.

Show a summary per file

| File | Description |
| --- | --- |
| workflows-devcontainer.code-workspace | Workspace config to open workflows repo alongside ../app. |
| postinstall.sh | Devcontainer post-create script to configure adb and install shell completion. |
| initcommand.sh | Devcontainer init script (build base image + generate .env). |
| configuration_params.py | Emits ADB_PORT based on platform (mac vs others). |
| base/docker/scripts/sitecustomize.py | Improves global exception hook (docstring + avoids shadowing type). |
| apps/caption-image/requirements.txt | Adds Python dependency on transformers. |
| apps/caption-image/README.md | Documents the new caption-image workflow and usage. |
| apps/caption-image/Dockerfile | Builds caption-image image; installs torch/torchvision + transformers; runs validation. |
| apps/caption-image/app/warmup_validate.py | Build-time BLIP warmup + caption assertion. |
| apps/caption-image/app/images.py | QueryGenerator implementation: fetches images, runs BLIP, updates captions. |
| apps/caption-image/app/caption_images.py | Typer-based entrypoint for running the caption workflow. |
| apps/caption-image/app/app.sh | Container entrypoint script that runs caption_images.py. |
| .vscode/launch.json | Debug configuration for Python. |
| .gitignore | Ignores aperturedb/ directories (local dev data). |
| .github/workflows/main.yml | Adds caption-image to the CI build matrix. |
| .devcontainer/dataset-ingestion/docker-compose.yml | Local stack compose file for dataset-ingestion devcontainer. |
| .devcontainer/dataset-ingestion/devcontainer.json | VS Code devcontainer definition for dataset-ingestion. |
| .devcontainer/crawl-website/docker-compose.yml | Local stack compose file for crawl-website devcontainer. |
| .devcontainer/crawl-website/devcontainer.json | VS Code devcontainer definition for crawl-website. |
| .devcontainer/caption-image/docker-compose.yml | Local stack compose file for caption-image devcontainer. |
| .devcontainer/caption-image/devcontainer.json | VS Code devcontainer definition for caption-image. |

Comment thread apps/caption-image/app/images.py Outdated
Comment thread apps/caption-image/app/images.py
Comment thread apps/caption-image/app/images.py Outdated
Comment thread apps/caption-image/app/images.py Outdated
Comment thread apps/caption-image/app/caption_images.py Outdated
Comment thread apps/caption-image/Dockerfile Outdated
Comment thread initcommand.sh Outdated
Comment thread .devcontainer/dataset-ingestion/docker-compose.yml Outdated
Comment thread .devcontainer/crawl-website/docker-compose.yml Outdated
Comment thread apps/caption-image/README.md
- Moved configuration_params.py to .devcontainer
- Updated initcommand.sh to loop over all devcontainers
- Provided ADB_PORT default in docker-compose.yml files
- Updated images.py to correctly use batching, add PyTorch inference context, fix missing DONE state, handle execution query errors
- Fixes to warmup_validate.py to avoid external network request
- Replaced PIP commands with requirements.txt
- Addressed logging and env var issues in caption_images.py
Copilot AI review requested due to automatic review settings May 24, 2026 20:25
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 21 changed files in this pull request and generated 6 comments.

Comment thread apps/caption-image/app/images.py Outdated
Comment thread apps/caption-image/app/images.py Outdated
Comment thread apps/caption-image/Dockerfile Outdated
Comment thread apps/caption-image/requirements.txt Outdated
Comment thread initcommand.sh Outdated
Comment thread .github/workflows/main.yml
- Add threading.Lock to get_model_and_processor lazy init
- Remove unused desc_blobs variable
- Change PRELOAD_MODEL default to false in Dockerfile
- Replace torchvision with pillow in requirements.txt
- Fix WORKFLOW_VERSION quoting in initcommand.sh
- Add caption-image service to docker-compose.yml
Copilot AI review requested due to automatic review settings May 25, 2026 02:10
@ad-claw000
Contributor

Replaced the batch_id pagination logic with a stable identifier (fetching all uncaptioned _uniqueids upfront) to prevent skipping images as their _done status is updated. This addresses point 2 from @ad-claw000's review. See commit 7725c94.
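
One way to read "fetching all uncaptioned _uniqueids upfront" is sketched below in plain Python. It assumes the set of pending ids is snapshotted once and then processed in fixed-size chunks, so later updates cannot shift what a chunk contains. `store`, `chunked`, and `caption_with_snapshot` are hypothetical names for illustration; the actual query code in images.py is not shown in this thread.

```python
def chunked(ids, size):
    """Yield consecutive slices of ids with at most size items each."""
    if size <= 0:
        raise ValueError("batch size must be greater than 0")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def caption_with_snapshot(store, batch_size):
    """Caption every image that was uncaptioned when the snapshot was taken."""
    # Snapshot once: this list does not change while captions are written.
    snapshot = sorted(image_id for image_id, caption in store.items() if caption is None)
    processed = []
    for batch in chunked(snapshot, batch_size):
        for image_id in batch:
            store[image_id] = f"caption for {image_id}"
        processed.extend(batch)
    return processed


if __name__ == "__main__":
    store = {i: None for i in range(10)}
    store[4] = "already captioned"
    print(caption_with_snapshot(store, 3))
```

The trade-off is that the full id list is held in memory and that images added after the snapshot are only picked up on the next run.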

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 21 out of 22 changed files in this pull request and generated 12 comments.

Comment thread initcommand.sh
Comment thread postinstall.sh
Comment thread apps/caption-image/app/images.py Outdated
Comment thread apps/caption-image/app/images.py Outdated
Comment thread apps/caption-image/app/images.py Outdated
Comment thread apps/caption-image/app/caption_images.py Outdated
Comment thread apps/caption-image/Dockerfile Outdated
Comment thread apps/caption-image/warmup_validate.py
Comment thread apps/caption-image/README.md Outdated
Comment thread apps/caption-image/README.md
- Use count query + server side batch in FindImageQueryGenerator
- Add validation for batch_size > 0
- Switch to RGB and handle decode exceptions
- Simplify uniqueids-captions zipping to skip failed
- Update log levels to check WF_LOG_LEVEL
- Add PRELOAD_MODEL flag to docker-compose.yml
- Add proper evaluation and no_grad to warmup_validate
- Clean up docs to reflect python None
- Add set -euo pipefail to bash scripts
@ad-claw000
Contributor

I've restored the stable identifier for pagination to prevent skipping images (using _uniqueid). This addresses the feedback from the CHANGES_REQUESTED review regarding pagination skipping images. See commit fa1f7eb.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 22 out of 23 changed files in this pull request and generated 4 comments.

Comment thread apps/caption-image/app/images.py
Comment thread apps/caption-image/app/images.py
Comment thread apps/caption-image/app/images.py
Comment thread docker-compose.yml
- Use batching in FindImage to avoid memory explosion (resolves #3322118023)
- Fix double-checked locking thread-safety issue (resolves #3322118041)
- Differentiate and abort on transient/system errors (resolves #3322118049)
Copilot AI review requested due to automatic review settings May 29, 2026 08:59
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 23 out of 24 changed files in this pull request and generated 4 comments.

Comment thread apps/caption-image/app/app.sh Outdated
Comment thread apps/caption-image/app/images.py
Comment thread apps/caption-image/app/images.py
Comment thread docker-compose.yml
…or handling

- Implement standard RUN_ONCE and SLEEPING_TIME loop in app.sh, and add status_tools.py progress reporting.
- Add RUN_ONCE to caption-image service in docker-compose.yml for consistency.
- Handle len(uniqueids) != len(r_blobs) mismatch by marking images as failed instead of silently looping.
- Handle per-image generation failures by appending to failed list instead of raising RuntimeError and aborting the batch.
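
The last two bullets can be sketched as follows. This is a hedged illustration rather than the PR's implementation: `caption_batch` and `caption_fn` are hypothetical names, and the sketch assumes ids and blobs arrive as two parallel lists, as the `len(uniqueids) != len(r_blobs)` check suggests.

```python
def caption_batch(uniqueids, blobs, caption_fn):
    """Return (captions_by_id, failed_ids) for one batch of images."""
    if len(uniqueids) != len(blobs):
        # Ids and blobs cannot be paired reliably, so mark the whole batch
        # as failed instead of leaving it pending and fetching it again.
        return {}, list(uniqueids)

    captions = {}
    failed = []
    for uniqueid, blob in zip(uniqueids, blobs):
        try:
            captions[uniqueid] = caption_fn(blob)
        except Exception:
            # One bad image should not abort the rest of the batch.
            failed.append(uniqueid)
    return captions, failed


def fake_caption(blob):
    """Stand-in for BLIP inference: rejects one specific payload."""
    if blob == b"corrupt":
        raise ValueError("cannot decode image")
    return f"an image of {len(blob)} bytes"


if __name__ == "__main__":
    print(caption_batch(["a", "b", "c"], [b"xx", b"corrupt", b"yyyy"], fake_caption))
    print(caption_batch(["a", "b"], [b"xx"], fake_caption))
```

Returning the failed ids separately lets the caller record them, for example in a status property, so that a loop filtering on uncaptioned images does not fetch them again.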
Copilot AI review requested due to automatic review settings May 29, 2026 12:21
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 23 out of 24 changed files in this pull request and generated no new comments.

@ad-claw000
Contributor

I have addressed the review feedback by restoring the stable identifier (_uniqueid) for pagination to ensure no images are skipped during updates. The code has been committed and pushed in 6f871d3.

Addresses the review comment to capture return values and handle non-zero status for update failures.
Copilot AI review requested due to automatic review settings May 30, 2026 13:30
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 23 out of 24 changed files in this pull request and generated 3 comments.

Comment thread apps/caption-image/app/images.py
Comment thread apps/caption-image/test.sh
Comment thread apps/caption-image/app/images.py Outdated
@ad-claw000
Contributor

I have restored the stable identifier for pagination to prevent skipping images (using _uniqueid). This addresses the feedback from the CHANGES_REQUESTED review regarding pagination skipping images. See commit 14bc6e6.

Copilot AI review requested due to automatic review settings June 1, 2026 13:50
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 23 out of 24 changed files in this pull request and generated 1 comment.

Comment thread apps/caption-image/app/images.py Outdated
Copilot AI review requested due to automatic review settings June 2, 2026 14:49
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 24 out of 25 changed files in this pull request and generated 6 comments.

Comment on lines +6 to +7
    volumes:
      - ./aperturedb/certificate:/cert
Comment on lines +29 to +30
    volumes:
      - ./aperturedb/certificate:/etc/lenz/certificate
Comment on lines +34 to +36
    volumes:
      - ./aperturedb/db:/aperturedb/db
      - ./aperturedb/logs:/aperturedb/logs
Comment on lines +62 to +63
    volumes:
      - ./aperturedb/certificate:/etc/nginx/certificate
Comment on lines +32 to +33
    aperturedb:
      image: aperturedata/aperturedb-community:latest
Comment on lines +13 to +14
    aperturedb:
      condition: service_started