fix(common-ai): pre-embed nodes so LlamaIndexEmbeddingOperator returns vectors by AgentNero-ch · Pull Request #68488 · apache/airflow

AgentNero-ch · 2026-06-12T23:46:15Z

What

LlamaIndexEmbeddingOperator.execute() returns chunks with "vector": None because it relies on VectorStoreIndex to populate node.embedding as a side effect. But VectorStoreIndex._get_node_with_embedding() attaches embeddings to copies of the nodes (via model_copy()), never the originals.
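The failure mode can be reproduced without LlamaIndex installed. The sketch below is illustrative only: `FakeNode` and `index_like_embed` are hypothetical stand-ins (not LlamaIndex classes or functions), and `dataclasses.replace` plays the role that `model_copy()` plays in the description above.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass
class FakeNode:
    """Hypothetical stand-in for a LlamaIndex node; not the real class."""
    text: str
    embedding: Optional[List[float]] = None


def index_like_embed(nodes: List[FakeNode]) -> List[FakeNode]:
    """Mimic the copy-then-embed behaviour described above."""
    embedded = []
    for node in nodes:
        # replace() builds a new object, standing in for model_copy().
        embedded.append(replace(node, embedding=[float(len(node.text))]))
    return embedded


originals = [FakeNode("alpha"), FakeNode("beta")]
copies = index_like_embed(originals)

# The copies carry vectors; the originals the operator reads from do not.
chunks = [{"text": n.text, "vector": n.embedding} for n in originals]
```

Reading `chunks` here gives `"vector": None` for every entry, which matches the symptom reported in the issue.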

Fix

Call embed_model.get_text_embedding_batch() on the original nodes before passing them to VectorStoreIndex. The index's internal embed_nodes() skips nodes whose .embedding is already set, so there are no duplicate API calls.
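A minimal sketch of the pre-embed idea, assuming simplified stand-ins: `FakeNode`, `CountingEmbedModel`, `pre_embed`, and `index_like_embed` are hypothetical names for illustration, and only the method name `get_text_embedding_batch` is taken from this PR. The stub counts batch calls so the "no duplicate API calls" claim can be checked.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeNode:
    """Hypothetical stand-in for a LlamaIndex node; not the real class."""
    text: str
    embedding: Optional[List[float]] = None


class CountingEmbedModel:
    """Hypothetical stub; only the method name mirrors the one in this PR."""

    def __init__(self) -> None:
        self.batch_calls = 0

    def get_text_embedding_batch(self, texts: List[str]) -> List[List[float]]:
        self.batch_calls += 1
        return [[float(len(t)), 1.0] for t in texts]


def pre_embed(nodes: List[FakeNode], embed_model: CountingEmbedModel) -> None:
    """Attach vectors to the original nodes in place."""
    vectors = embed_model.get_text_embedding_batch([n.text for n in nodes])
    for node, vector in zip(nodes, vectors):
        node.embedding = vector


def index_like_embed(nodes: List[FakeNode], embed_model: CountingEmbedModel) -> int:
    """Stand-in for the index's embedding pass: skip nodes already embedded."""
    pending = [n for n in nodes if n.embedding is None]
    if pending:
        pre_embed(pending, embed_model)
    return len(pending)


nodes = [FakeNode("alpha"), FakeNode("beta")]
model = CountingEmbedModel()

pre_embed(nodes, model)                          # the fix: embed the originals first
newly_embedded = index_like_embed(nodes, model)  # nothing left to embed

chunks = [{"text": n.text, "vector": n.embedding} for n in nodes]
```

In this sketch the stub is called once in total, and `chunks` is built from the same objects that were embedded, so no entry has a `None` vector.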

Why this works

From llama-index-core source (indices/utils.py):
```python
def embed_nodes(nodes, embed_model, ...):
    for node in nodes:
        if node.embedding is not None:
            continue  # skip already-embedded nodes
        ...
```

Verified across llama-index-core v0.10.68 through v0.14.22 — all versions copy nodes internally, so the side-effect assumption has never held.

Testing

Updated unit tests to mock get_text_embedding_batch instead of relying on VectorStoreIndex side effects. Added a new test verifying the pre-embed step is called with correct node texts.
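The shape of such a test might look like the sketch below. It is not the test from this PR: `pre_embed` is a hypothetical helper standing in for the operator's pre-embed step, and the nodes are plain `SimpleNamespace` objects rather than LlamaIndex nodes. Only `unittest.mock` from the standard library is used.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def pre_embed(nodes, embed_model):
    """Hypothetical helper standing in for the operator's pre-embed step."""
    vectors = embed_model.get_text_embedding_batch([n.text for n in nodes])
    for node, vector in zip(nodes, vectors):
        node.embedding = vector


def test_pre_embed_called_with_node_texts():
    nodes = [
        SimpleNamespace(text="first chunk", embedding=None),
        SimpleNamespace(text="second chunk", embedding=None),
    ]
    embed_model = MagicMock()
    embed_model.get_text_embedding_batch.return_value = [[0.1, 0.2], [0.3, 0.4]]

    pre_embed(nodes, embed_model)

    # The mock must be called exactly once, with the node texts in order.
    embed_model.get_text_embedding_batch.assert_called_once_with(
        ["first chunk", "second chunk"]
    )
    # The vectors must land on the original node objects.
    assert [n.embedding for n in nodes] == [[0.1, 0.2], [0.3, 0.4]]


test_pre_embed_called_with_node_texts()
```

Mocking the batch call directly keeps the test independent of whatever the index does with its own copies of the nodes.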

Closes #68416

…s vectors

VectorStoreIndex._get_node_with_embedding() attaches embeddings to *copies* of nodes (via model_copy()), never the originals. The operator was relying on VectorStoreIndex populating node.embedding as a side effect, which always yielded None.

Fix: call embed_model.get_text_embedding_batch() on the original nodes before passing them to VectorStoreIndex. The index's internal embed_nodes() skips nodes whose .embedding is already set, so there are no duplicate API calls.

Closes apache#68416

AgentNero-ch requested review from gopidesupavan and kaxil as code owners June 12, 2026 23:46

kaxil mentioned this pull request Jun 13, 2026

Fix LlamaIndexEmbeddingOperator returning vector=None for every chunk #68491

Merged

kaxil closed this in #68491 Jun 15, 2026

AgentNero-ch wants to merge 1 commit into apache:main from AgentNero-ch:fix/llamaindex-embedding-vector-none
