Skip to content

Enable vllm for DocSum#1716

Merged
letonghan merged 17 commits into
opea-project:mainfrom
letonghan:vllm_docsum
Mar 28, 2025
Merged

Enable vllm for DocSum#1716
letonghan merged 17 commits into
opea-project:mainfrom
letonghan:vllm_docsum

Conversation

@letonghan
Copy link
Copy Markdown
Collaborator

Description

Set vllm as default llm serving, and add related docker compose files, readmes, and test scripts.

Issues

#1436

Type of change

List the type of change like below. Please delete options that are not relevant.

  • New feature (non-breaking change which adds new functionality)

Dependencies

None

Tests

local tested

Signed-off-by: letonghan <letong.han@intel.com>
Signed-off-by: letonghan <letong.han@intel.com>
Signed-off-by: letonghan <letong.han@intel.com>
@letonghan letonghan requested a review from XinyaoWa as a code owner March 25, 2025 05:46
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 25, 2025

Dependency Review

✅ No vulnerabilities or license issues found.

Scanned Files

@letonghan letonghan requested a review from lvliang-intel March 25, 2025 05:57
letonghan and others added 2 commits March 25, 2025 14:03
letonghan and others added 4 commits March 25, 2025 22:49
Signed-off-by: letonghan <letong.han@intel.com>
Signed-off-by: letonghan <letong.han@intel.com>
@letonghan
Copy link
Copy Markdown
Collaborator Author

@XinyaoWa xinyao will help to check the DocSum tgi issues of max tokens/langchain dependency versions.

Signed-off-by: letonghan <letong.han@intel.com>
Signed-off-by: letonghan <letong.han@intel.com>
Signed-off-by: letonghan <letong.han@intel.com>
Copy link
Copy Markdown
Collaborator

@lkk12014402 lkk12014402 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment thread DocSum/docker_compose/intel/hpu/gaudi/README.md Outdated
Signed-off-by: letonghan <letong.han@intel.com>
Comment thread DocSum/docker_compose/intel/hpu/gaudi/compose.yaml Outdated
Signed-off-by: letonghan <letong.han@intel.com>
@letonghan letonghan merged commit d4dcbd1 into opea-project:main Mar 28, 2025
chyundunovDatamonsters pushed a commit to chyundunovDatamonsters/OPEA-GenAIExamples that referenced this pull request Apr 1, 2025
Set vllm as default llm serving, and add related docker compose files, readmes, and test scripts.

Fix issue opea-project#1436

Signed-off-by: letonghan <letong.han@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>
chyundunovDatamonsters pushed a commit to chyundunovDatamonsters/OPEA-GenAIExamples that referenced this pull request Apr 1, 2025
Set vllm as default llm serving, and add related docker compose files, readmes, and test scripts.

Fix issue opea-project#1436

Signed-off-by: letonghan <letong.han@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>
chyundunovDatamonsters pushed a commit to chyundunovDatamonsters/OPEA-GenAIExamples that referenced this pull request May 16, 2025
Set vllm as default llm serving, and add related docker compose files, readmes, and test scripts.

Fix issue opea-project#1436

Signed-off-by: letonghan <letong.han@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
letonghan pushed a commit that referenced this pull request Sep 17, 2025
* initial file structure created. Populated with unimplemented files

Signed-off-by: Madison Evans <madison.evans@intel.com>

* added relevant code to files within comps/router/deployment

Signed-off-by: Madison Evans <madison.evans@intel.com>

* added Dockerfile, opea_router_microservice.py, README.md, and requirements.txt contents

Signed-off-by: Madison Evans <madison.evans@intel.com>

* added controller components for router instances

Signed-off-by: Madison Evans <madison.evans@intel.com>

* added initial routellm controller test script in router directory

Signed-off-by: Madison Evans <madison.evans@intel.com>

* fixed requirements.txt issue

Signed-off-by: Madison Evans <madison.evans@intel.com>

* added HUGGINGFACEHUB_API_TOKEN as an env variable

Signed-off-by: Madison Evans <madison.evans@intel.com>

* removed hard OPENAI dependency and made OPENAI_API_KEY default to empty str

Signed-off-by: Madison Evans <madison.evans@intel.com>

* removed empty str fallback for OPENAI_API_KEY var

Signed-off-by: Madison Evans <madison.evans@intel.com>

* target localhost in RouteLLM E2E test to avoid Docker network issues

Signed-off-by: Madison Evans <madison.evans@intel.com>

* fixed e2e test issue for routellm test

Signed-off-by: Madison Evans <madison.evans@intel.com>

* changed the checkpoint path for the custom mf model weights. Now using 'routellm-e5-base-V2' under OPEA HF group

Signed-off-by: Madison Evans <madison.evans@intel.com>

* moved RouteEndpointDoc class into 'api_protocol.py' under cores/proto

Signed-off-by: Madison Evans <madison.evans@intel.com>

* added 'router-compose.yaml' to workflows/docker/compose

Signed-off-by: Madison Evans <madison.evans@intel.com>

* pre commit format updates

Signed-off-by: Madison Evans <madison.evans@intel.com>

* removed the forked version of RouteLLM from requirements.txt dependency. Now pulls from the referenced repo and then applies the patch located at 'comps/router/src/hf_compatibility.patch'

Signed-off-by: Madison Evans <madison.evans@intel.com>

* updated README to reflect the patch usage for modified RouteLLM repo

Signed-off-by: Madison Evans <madison.evans@intel.com>

* added H1 title to README

Signed-off-by: Madison Evans <madison.evans@intel.com>

* comply with formatting requests.

Signed-off-by: Haim Barad <haim.barad@intel.com>

* fix pre-commit issues: remove trailing whitespace and add newline

Signed-off-by: Haim Barad <haim.barad@intel.com>

---------

Signed-off-by: Madison Evans <madison.evans@intel.com>
Signed-off-by: Haim Barad <haim.barad@intel.com>
Co-authored-by: Haim Barad <haim.barad@intel.com>
cogniware-devops pushed a commit to Cogniware-Inc/GenAIExamples that referenced this pull request Dec 19, 2025
Set vllm as default llm serving, and add related docker compose files, readmes, and test scripts.

Fix issue opea-project#1436

Signed-off-by: letonghan <letong.han@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: cogniware-devops <ambarish.desai@cogniware.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants