Description
When using ModelBuilder (SDK v3) with a pre-built DJL LMI container image and source_code (via SourceCode) to provide a custom requirements.txt, the model directory /opt/ml/model/ becomes read-only at runtime. This breaks model downloads from the HuggingFace Hub, because the DJL container tries to write its download cache under /opt/ml/model/.
Additionally, ModelBuilder overrides the user-provided HF_MODEL_ID environment variable with the value from the model= parameter, making it impossible to point the container at the local model path (/opt/ml/model) when S3 model artifacts are also provided via s3_model_data_url.
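To make the expected precedence concrete, here is a minimal sketch (a hypothetical helper, not the SDK's actual implementation): values ModelBuilder derives from model= should act as defaults only, with user-supplied env_vars taking precedence.

```python
# Hypothetical sketch of the env-var precedence users expect from ModelBuilder:
# derived values (e.g. HF_MODEL_ID taken from model=) act as defaults,
# and user-supplied env_vars override them.
def merge_env_vars(derived: dict, user: dict) -> dict:
    """Return derived defaults overridden by user-supplied values."""
    return {**derived, **user}

derived = {"HF_MODEL_ID": "chromadb/context-1"}  # derived from model=
user = {"HF_MODEL_ID": "/opt/ml/model"}          # what the user asked for
print(merge_env_vars(derived, user)["HF_MODEL_ID"])  # /opt/ml/model
```

Today the SDK does the opposite: the derived value wins, so the user cannot redirect the container to the mounted artifacts.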
How to Reproduce
```python
from sagemaker.serve import ModelBuilder, ModelServer
from sagemaker.serve.builder.schema_builder import SchemaBuilder
from sagemaker.serve.mode.function_pointers import Mode
from sagemaker.serve.model_builder import SourceCode

source_code = SourceCode(
    source_dir="./model_code",
    requirements="requirements.txt",  # e.g. transformers>=4.55.0
)

mb = ModelBuilder(
    model="chromadb/context-1",  # HF Hub model ID
    role_arn=ROLE,
    image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/djl-inference:0.36.0-lmi22.0.0-cu129",
    model_server=ModelServer.DJL_SERVING,
    schema_builder=SchemaBuilder(
        {"inputs": "Hello", "parameters": {"max_new_tokens": 64}},
        [{"generated_text": "Hi"}],
    ),
    source_code=source_code,
    env_vars={"OPTION_TENSOR_PARALLEL_DEGREE": "4", ...},
    instance_type="ml.g6e.12xlarge",
    mode=Mode.SAGEMAKER_ENDPOINT,
)

model = mb.build()
endpoint = mb.deploy(endpoint_name="test", wait=True)
# FAILS: OSError: [Errno 30] Read-only file system: /opt/ml/model/models--chromadb--context-1
```
Observed Behavior
- ModelBuilder.build() packages the source_code directory into a model.tar.gz and uploads it to S3
- At deploy time, SageMaker mounts this tar.gz at /opt/ml/model/, which becomes read-only
- ModelBuilder sets HF_MODEL_ID=chromadb/context-1 (from model=), overriding any user-provided value
- The DJL LMI container sees HF_MODEL_ID=chromadb/context-1 and tries to download the model from HF Hub
- The HF Hub download tries to write its cache to /opt/ml/model/models--chromadb--context-1/
- This fails with OSError: [Errno 30] Read-only file system
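The directory name in the error comes from huggingface_hub's on-disk cache layout, where a repo id maps deterministically to a folder name. A small sketch of that convention:

```python
# Sketch of huggingface_hub's cache-folder naming convention:
# a repo id "org/name" is stored under "<repo_type>s--org--name"
# inside the hub cache directory.
def hub_cache_folder(repo_id: str, repo_type: str = "model") -> str:
    return f"{repo_type}s--" + repo_id.replace("/", "--")

print(hub_cache_folder("chromadb/context-1"))
# models--chromadb--context-1
```

Because the cache directory resolves to /opt/ml/model/, the write lands at /opt/ml/model/models--chromadb--context-1 and fails.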
CloudWatch logs confirm:
```
OSError: [Errno 30] Read-only file system: /opt/ml/model/models--chromadb--context-1
```
Expected Behavior
Users should be able to use ModelBuilder with:
- A pre-built container image (e.g. DJL LMI)
- source_code with a custom requirements.txt to install additional dependencies at container startup
- A HuggingFace Hub model ID that the container downloads at runtime
The requirements.txt installation should not make /opt/ml/model/ read-only, or the HF Hub cache should be redirected to a writable location (e.g. /tmp).
Workaround Attempted
Setting HF_HOME=/tmp/hf_home and HUGGINGFACE_HUB_CACHE=/tmp/hf_home/hub in env_vars. Both variables appear in the container environment, but the DJL container still writes to /opt/ml/model/.
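For reference, huggingface_hub documents a precedence order when resolving its hub cache directory, sketched below (this is an illustration of the documented precedence, not the library's code). Given that these variables are visible in the container yet ignored, the DJL handler presumably resolves the model path before or independently of them.

```python
import os

# Sketch of huggingface_hub's documented cache-dir precedence:
# HF_HUB_CACHE > HUGGINGFACE_HUB_CACHE (legacy) > HF_HOME/hub > default
def resolve_hub_cache(env: dict) -> str:
    if "HF_HUB_CACHE" in env:
        return env["HF_HUB_CACHE"]
    if "HUGGINGFACE_HUB_CACHE" in env:
        return env["HUGGINGFACE_HUB_CACHE"]
    hf_home = env.get("HF_HOME", os.path.expanduser("~/.cache/huggingface"))
    return os.path.join(hf_home, "hub")

# With the workaround's variables set, the cache should resolve under /tmp:
print(resolve_hub_cache({"HF_HOME": "/tmp/hf_home",
                         "HUGGINGFACE_HUB_CACHE": "/tmp/hf_home/hub"}))
# /tmp/hf_home/hub
```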
Use Case
This is a common pattern for deploying newer models (e.g. OpenAI GPT-OSS based models like chromadb/context-1) that require a newer transformers version than what is bundled in the DJL LMI container. The source_code with requirements.txt is the natural SDK v3 mechanism for this, but it is incompatible with HF Hub model downloads.
Environment
- SageMaker Python SDK: 3.6.0
- Container: djl-inference:0.36.0-lmi22.0.0-cu129
- Instance: ml.g6e.12xlarge
- Region: us-east-1