This guide is designed for automated agents and LLMs to quickly understand how to build and run the test suite for py-aperturedb locally.
The integration tests rely on local aperturedb + lenz containers orchestrated by Docker Compose. The environment runs a client container to execute the pytest suite against these spun-up backends.
To run the tests locally, you first need to build the aperturedata/aperturedb-python-tests:latest Docker image. From the workspace root, populate the expected context directory and execute docker build:
# 1. Provide the expected context data for the test Dockerfile
mkdir -p test/aperturedb/logs/runner_state
mkdir -p docker/tests/aperturedata
# 2. Copy the Python source and configurations over
cp -r aperturedb pyproject.toml README.md docker/tests/aperturedata
mkdir -m 777 -p docker/tests/aperturedata/test/aperturedb
# 3. Copy test inputs and test scripts over
cp -r test/*.py test/*.sh test/input docker/tests/aperturedata/test
# 4. Build the test image locally (BuildKit enabled for faster layer caching)
DOCKER_BUILDKIT=1 docker build -t aperturedata/aperturedb-python-tests:latest -f docker/tests/Dockerfile .
Note on package versions: The Dockerfile installs CLIP first (which pulls torch 2.12.0), then installs the .[dev] extras (which pull facenet-pytorch, downgrading torch to 2.2.2 and numpy to 1.26.4). The final installed versions are torch=2.2.2, torchvision=0.17.2, numpy=1.26.4, tensorflow=2.21.0, triton=2.2.0.
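The staging steps above can be wrapped in a single function so that an agent fails early when an expected input is missing. This is a minimal sketch: the name stage_test_context and the up-front existence check are assumptions for illustration and are not part of the repository. The mkdir and cp commands mirror the ones listed above.

```shell
# Sketch: stage the Docker build context described above.
# stage_test_context is a hypothetical helper name, not part of the repository.
stage_test_context() {
    local root="${1:-.}"   # workspace root (defaults to the current directory)
    local ctx="$root/docker/tests/aperturedata"
    local required

    # Fail early if an expected input is missing, instead of staging a partial context.
    for required in aperturedb pyproject.toml README.md test/input; do
        if [ ! -e "$root/$required" ]; then
            echo "missing expected path: $root/$required" >&2
            return 1
        fi
    done

    # 1. Create the log directory and the context directory.
    mkdir -p "$root/test/aperturedb/logs/runner_state" "$ctx" || return 1
    # 2. Copy the Python source and configuration files.
    cp -r "$root/aperturedb" "$root/pyproject.toml" "$root/README.md" "$ctx" || return 1
    mkdir -m 777 -p "$ctx/test/aperturedb" || return 1
    # 3. Copy test scripts and test inputs.
    cp -r "$root"/test/*.py "$root"/test/*.sh "$root/test/input" "$ctx/test" || return 1
}
```

From the workspace root, calling stage_test_context . and then running the docker build command shown above should be equivalent to running the steps by hand.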
Once the test image is built, use the run_test_container.sh wrapper script located in the test/ directory to coordinate the backend integration environment and run the test suite.
cd test
set -a && source .env && set +a
bash run_test_container.sh
This script:
- Orchestrates spinning up the aperturedb, nginx, and lenz containers using docker-compose.yml on temporary networking bridges.
- Starts two parallel test suites (HTTP and non-HTTP modes), each in its own isolated network.
- Mounts the local test/run_test.sh into each test container so changes to it take effect without rebuilding the image.
- Checks exit codes for successes or failures and cleans up.
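The pattern of running two suites in parallel and checking each exit code can be sketched as follows. This is an illustration of the general technique only, not the contents of run_test_container.sh: the function name run_suites_in_parallel is hypothetical, and each argument is a command string standing in for one suite.

```shell
# Sketch only: the real run_test_container.sh may be structured differently.
# run_suites_in_parallel is a hypothetical helper; each argument is a command
# string standing in for one test suite (e.g. the HTTP and non-HTTP runs).
run_suites_in_parallel() {
    local http_cmd="$1" tcp_cmd="$2"
    local http_pid tcp_pid http_rc=0 tcp_rc=0

    # Start both suites in the background and remember their PIDs.
    bash -c "$http_cmd" &
    http_pid=$!
    bash -c "$tcp_cmd" &
    tcp_pid=$!

    # "wait <pid>" returns that job's exit status, so each suite is checked separately.
    wait "$http_pid" || http_rc=$?
    wait "$tcp_pid" || tcp_rc=$?

    echo "HTTP suite exit code: $http_rc"
    echo "non-HTTP suite exit code: $tcp_rc"

    # Succeed only if both suites succeeded.
    [ "$http_rc" -eq 0 ] && [ "$tcp_rc" -eq 0 ]
}
```

Waiting on each PID individually, rather than using a bare wait, keeps the two exit codes distinguishable, so a failure in one mode is not masked by success in the other.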
Expected results with a clean environment: ~118 passed + 4 skipped for non-HTTP, ~75 passed + 4 skipped for HTTP. The 4 skipped tests require the slow mark which is not selected by default.
A subset of the pytest tests interact with external objects, such as fetching blobs directly from cloud providers (AWS S3, Google Cloud Storage). If your local environment or the automated agent lacks valid AWS/GCP credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, GCP_SERVICE_ACCOUNT_KEY), these tests fail with authentication errors when they construct storage.Client() or boto3 clients during execution.
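Before starting a run, an agent can check which of the three credential variables named above are missing. The following is a minimal sketch; missing_cloud_credentials is a hypothetical helper name, and it only checks that the variables are non-empty, not that the credentials are valid.

```shell
# Sketch: report which cloud credential variables are unset or empty.
# missing_cloud_credentials is a hypothetical helper name.
missing_cloud_credentials() {
    local name value missing=""
    for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY GCP_SERVICE_ACCOUNT_KEY; do
        # Indirect lookup of the variable named by $name.
        # eval is acceptable here because the names are fixed literals.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    # Print the missing names, space separated; empty output means all are set.
    echo "${missing# }"
    [ -z "$missing" ]
}
```

The function returns a non-zero status when any variable is missing, so it can gate the decision to exclude the cloud-dependent tests.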
To skip these tests, set SKIP_SLOW_TESTS=true before running run_test_container.sh. This environment variable is passed into the container and automatically appends "and not slow and not external_network" to the pytest marker expression.
cd test
set -a && source .env && set +a
SKIP_SLOW_TESTS=true bash run_test_container.sh
With this variable set, an LLM or agent can run the internal aperturedb connector and Python API tests without failures caused solely by missing external cloud credentials.
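The effect of SKIP_SLOW_TESTS on the marker expression can be illustrated with a small function. This is a sketch under assumptions: build_marker_expression is a hypothetical name, the base expression passed in is illustrative, and the actual logic in the repository's test scripts may differ.

```shell
# Sketch: how a SKIP_SLOW_TESTS switch could extend a pytest -m expression.
# build_marker_expression and its base-expression argument are hypothetical;
# the actual logic lives in the repository's test scripts and may differ.
build_marker_expression() {
    local expr="$1"
    if [ "${SKIP_SLOW_TESTS:-false}" = "true" ]; then
        if [ -z "$expr" ]; then
            # No base expression: the exclusions stand on their own.
            expr="not slow and not external_network"
        else
            expr="$expr and not slow and not external_network"
        fi
    fi
    echo "$expr"
}
```

The result would be passed to pytest through its -m option, for example pytest -m "$(build_marker_expression "not remote_credentials")", where the base expression shown is only an example.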
These warnings appear during a successful local run and can be ignored:
- PytestUnknownMarkWarning for slow, external_network, remote_credentials, tcp, http, dask: these custom marks are used but not registered in pyproject.toml. Tests still run and filter correctly.
- CoverageWarning: No data was collected. The --cov=aperturedb flag collects coverage data inside the container, but the HTML report step (coverage html) cannot find it. This is a known limitation of the containerized test setup and does not indicate test failures.
- Port 8787 is already in use (Dask): when two test suites run in parallel and both use Dask, the scheduler port conflicts. Dask auto-reassigns to an available port; no action needed.
- Pydantic V2 deprecation (model_fields on instance): library internals, not caused by test code.
- numpy.fromstring deprecation in aperturedb/transformers/clip.py: use np.frombuffer instead; harmless for the current numpy version.
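When scanning a captured log, it can help to hide the benign warnings listed above so that unexpected ones stand out. The following is a rough, line-based sketch: filter_known_warnings is a hypothetical helper, and the patterns are assumptions about how the warning text appears in the log, so they may need adjusting. Multi-line warnings will only have their matching lines removed.

```shell
# Sketch: drop log lines that match the known benign warnings listed above.
# filter_known_warnings is a hypothetical helper; the patterns are assumptions
# about the log text and may need adjusting.
filter_known_warnings() {
    local rc=0
    grep -v -E \
        -e 'PytestUnknownMarkWarning' \
        -e 'CoverageWarning: No data was collected' \
        -e 'Port 8787 is already in use' \
        -e 'model_fields' \
        -e 'fromstring' \
        || rc=$?
    # grep exits 1 when every line was filtered out; only 2 or higher is a real error.
    [ "$rc" -le 1 ]
}
```

The function reads from standard input, so a saved log can be piped through it, for example filter_known_warnings < run.log, where run.log is a placeholder file name.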
| Aspect | Local | CI (GitHub Actions) |
|---|---|---|
| DB image | aperturedata/aperturedb-community:latest | aperturedata/aperturedb:dev |
| Lenz tag | latest | dev |
| Test filtering | Skips remote_credentials, external_network, S3/GCS loaders | Runs all tests including cloud-dependent ones |
| AWS/GCP credentials | Not present; use SKIP_SLOW_TESTS=true | Present as GitHub secrets |
| Log upload on failure | Skipped (no AWS creds) | Uploads to S3 bucket python-ci-runs |
| BUILD_AUX_IMAGES | true (default in ci.sh); builds notebook/coverage images | false; skips aux images to save time |
| TEST_PROTOCOL | both (default in run_test_container.sh) | both (explicit in pr.yaml) |
| RUNNER_NAME | default (from test/.env) | ${{ runner.name }}_pr |
| Docker cache | Local layer cache only | BuildKit + --cache-from inline cache from registry |
| Image push | Never pushes | Pushes on merge to main (controlled by NO_PUSH) |
| Kaggle credentials | Dummy (KAGGLE_username=ci, KAGGLE_key=dummy) | Same dummy values |
| Coverage HTML | coverage html fails (no data in container path) | Same issue; coverage not collected meaningfully |
| Concurrency | N/A | New push to PR cancels in-progress run |
| run_test.sh override | Mounted as volume; local edits apply immediately | Also mounted as volume |