MultimodalQnA audio features completion #1697
Closed
mhbuehler wants to merge 65 commits into
* Added tests + updated docs for asr mp3 change
  Signed-off-by: okhleif-IL <omar.khleif@intel.com>
* addressed review comments
  Signed-off-by: okhleif-IL <omar.khleif@intel.com>
---------
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
* Added logic for showing/deleting files from vector store
  Signed-off-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Added message to show when vector store is empty
  Signed-off-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
* Update MultimodalQnA/ui/gradio/multimodalqna_ui_gradio.py
  Co-authored-by: Dina Suehiro Jones <dina.s.jones@intel.com>
---------
Signed-off-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Dina Suehiro Jones <dina.s.jones@intel.com>
* Parameterize UI timeout and increase default
  Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
* Add new variable to compose.yaml
  Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
* Update READMEs
  Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
---------
Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
…ts (#58)
* MultimodalQnA README and diagram updates for phase 3 enhancements
  Signed-off-by: dmsuehir <dina.s.jones@intel.com>
* Wording
  Signed-off-by: dmsuehir <dina.s.jones@intel.com>
* Update to remove your_* vars
  Signed-off-by: dmsuehir <dina.s.jones@intel.com>
* Updates based on review comments
  Signed-off-by: dmsuehir <dina.s.jones@intel.com>
---------
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
* added TTS linkage to backend
  Signed-off-by: okhleif-IL <omar.khleif@intel.com>
* added modalities as a toggle
  Signed-off-by: okhleif-IL <omar.khleif@intel.com>
* doc updates and code refactor
  Signed-off-by: okhleif-IL <omar.khleif@intel.com>
* added tts test to megaservice tests
  Signed-off-by: okhleif-IL <omar.khleif@intel.com>
* addressed recent review comments
  Signed-off-by: okhleif-IL <omar.khleif@intel.com>
---------
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
* Add test for image and audio data ingestion
  Signed-off-by: dmsuehir <dina.s.jones@intel.com>
* README updates
  Signed-off-by: dmsuehir <dina.s.jones@intel.com>
* Add Gaudi tests
  Signed-off-by: dmsuehir <dina.s.jones@intel.com>
* Add note about matching base names in test
  Signed-off-by: dmsuehir <dina.s.jones@intel.com>
---------
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
* fixed test and added tts validation
  Signed-off-by: okhleif-IL <omar.khleif@intel.com>
* added gaudi test, reverted -speech change
  Signed-off-by: okhleif-IL <omar.khleif@intel.com>
---------
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
…radio version Signed-off-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
Signed-off-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
Signed-off-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
* Enable audio caption upload in the UI
  Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
* Improve handling of unsupported audio formats
  Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
* Improve label and exception
  Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
* Replace exception with error message so audio component still works
  Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
---------
Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
…se (#64) Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
Signed-off-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
Signed-off-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
Signed-off-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
* Add missing env vars for MMQnA UI data prep endpoints
  Signed-off-by: dmsuehir <dina.s.jones@intel.com>
* Remove dockerfile branch
  Signed-off-by: dmsuehir <dina.s.jones@intel.com>
---------
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
Signed-off-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
Signed-off-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
Revert Dockerfile
Adds audio response toggle button to UI
Dependency Review: ✅ No vulnerabilities or license issues found.
for more information, see https://pre-commit.ci
mhbuehler
commented
Mar 19, 2025
RUN apt-get update && apt-get install -y --no-install-recommends git
RUN git clone --depth 1 https://github.com/opea-project/GenAIComps.git
# RUN git clone --depth 1 https://github.com/opea-project/GenAIComps.git
RUN git clone --single-branch --branch="mmqna-phase3" https://github.com/mhbuehler/GenAIComps.git
Collaborator
Author
This change is for testing purposes and has to be reverted before merging.
letonghan
pushed a commit
that referenced
this pull request
Sep 17, 2025
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
letonghan
pushed a commit
that referenced
this pull request
Sep 17, 2025
* add support for remote server Signed-off-by: alexsin368 <alex.sin@intel.com> * add steps to enable remote server Signed-off-by: alexsin368 <alex.sin@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove use_remote_service Signed-off-by: alexsin368 <alex.sin@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add OpenAI models instructions, fix format of commands Signed-off-by: alexsin368 <alex.sin@intel.com> * simplify ChatOpenAI instantiation Signed-off-by: alexsin368 <alex.sin@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "simplify ChatOpenAI instantiation" This reverts commit b7c4acf7d397a284f1499254fa8832533c0c98e3. * add back check and logic for llm_engine, set openai_key argument Signed-off-by: alexsin368 <alex.sin@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Provide ARCH option for lvm-video-llama image build (#1630) Signed-off-by: ZePan110 <ze.pan@intel.com> Signed-off-by: alexsin368 <alex.sin@intel.com> * Add sglang microservice for supporting llama4 model (#1640) Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com> Co-authored-by: Lv,Liang1 <liang1.lv@intel.com> Signed-off-by: alexsin368 <alex.sin@intel.com> * Remove invalid codeowner. 
(#1642) Signed-off-by: ZePan110 <ze.pan@intel.com> Signed-off-by: alexsin368 <alex.sin@intel.com> * add support for remote server Signed-off-by: alexsin368 <alex.sin@intel.com> * add steps to enable remote server Signed-off-by: alexsin368 <alex.sin@intel.com> * remove use_remote_service Signed-off-by: alexsin368 <alex.sin@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: alexsin368 <alex.sin@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: alexsin368 <alex.sin@intel.com> * bug fix for chunk_size and overlap cause error in dataprep ingestion (#1643) * bug fix for dataingest url Signed-off-by: Mustafa <mustafa.cetin@intel.com> * add validation function Signed-off-by: Mustafa <mustafa.cetin@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * validation update Signed-off-by: Mustafa <mustafa.cetin@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update validation function Signed-off-by: Mustafa <mustafa.cetin@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mustafa <mustafa.cetin@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: alexsin368 <alex.sin@intel.com> * MariaDB Vector integrations for retriever & dataprep services (#1645) * Add MariaDB Vector third-party service MariaDB Vector was introduced since MariaDB Server 11.7 Signed-off-by: Razvan-Liviu Varzaru <razvan@mariadb.org> * Add retriever MariaDB Vector integration Signed-off-by: Razvan-Liviu Varzaru <razvan@mariadb.org> * Add dataprep MariaDB Vector integration Signed-off-by: Razvan-Liviu Varzaru <razvan@mariadb.org> * [pre-commit.ci] auto fixes from pre-commit.com hooks 
for more information, see https://pre-commit.ci * Fix CI failures - md5 is used for the primary key not as a security hash - fixed mariadb readme headers Signed-off-by: Razvan-Liviu Varzaru <razvan@mariadb.org> --------- Signed-off-by: Razvan-Liviu Varzaru <razvan@mariadb.org> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: alexsin368 <alex.sin@intel.com> * update PR reviewers (#1651) Signed-off-by: chensuyue <suyue.chen@intel.com> Signed-off-by: alexsin368 <alex.sin@intel.com> * Expand test matrix, find all tests use 3rd party Dockerfiles (#1676) * Expand test matrix, find all tests use 3rd party Dockerfiles Signed-off-by: chensuyue <suyue.chen@intel.com> Signed-off-by: alexsin368 <alex.sin@intel.com> * fix the typo of README.md Comp (#1679) Update README.md for first entry of OPEA Signed-off-by: alexsin368 <alex.sin@intel.com> * Fix request handle timeout issue (#1687) Signed-off-by: lvliang-intel <liang1.lv@intel.com> Signed-off-by: alexsin368 <alex.sin@intel.com> * FEAT: Enable OPEA microservices to start as MCP servers (#1635) Signed-off-by: alexsin368 <alex.sin@intel.com> * Fix huggingface_hub API upgrade issue (#1691) * Fix huggingfacehub API upgrade issue Signed-off-by: lvliang-intel <liang1.lv@intel.com> Signed-off-by: alexsin368 <alex.sin@intel.com> * add OpenAI models instructions, fix format of commands Signed-off-by: alexsin368 <alex.sin@intel.com> * Fix dataprep opensearch ingest issue (#1697) Signed-off-by: lvliang-intel <liang1.lv@intel.com> Signed-off-by: alexsin368 <alex.sin@intel.com> * Fix embedding issue with ArangoDB due to deprecated HuggingFace API (#1694) Signed-off-by: lvliang-intel <liang1.lv@intel.com> Signed-off-by: alexsin368 <alex.sin@intel.com> * simplify ChatOpenAI instantiation Signed-off-by: alexsin368 <alex.sin@intel.com> * Revert "simplify ChatOpenAI instantiation" This reverts commit b7c4acf7d397a284f1499254fa8832533c0c98e3. 
Signed-off-by: alexsin368 <alex.sin@intel.com> * add back check and logic for llm_engine, set openai_key argument Signed-off-by: alexsin368 <alex.sin@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: alexsin368 <alex.sin@intel.com> Signed-off-by: ZePan110 <ze.pan@intel.com> Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com> Signed-off-by: Mustafa <mustafa.cetin@intel.com> Signed-off-by: Razvan-Liviu Varzaru <razvan@mariadb.org> Signed-off-by: chensuyue <suyue.chen@intel.com> Signed-off-by: lvliang-intel <liang1.lv@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ying Hu <ying.hu@intel.com> Co-authored-by: ZePan110 <ze.pan@intel.com> Co-authored-by: Liang Lv <liang1.lv@intel.com> Co-authored-by: Mustafa <109312699+MSCetin37@users.noreply.github.com> Co-authored-by: Razvan Liviu Varzaru <45736827+RazvanLiviuVarzaru@users.noreply.github.com> Co-authored-by: chen, suyue <suyue.chen@intel.com> Co-authored-by: Spycsh <39623753+Spycsh@users.noreply.github.com>
Description
This PR completes the third and final phase of the RFC for MultimodalQnA image and audio support. The changes in GenAIExamples are listed below. An accompanying PR in GenAIComps is here.
New Features:
Bug Fixes:
Issues
Image and Audio Support in MultimodalQnA RFC
Type of change
Dependencies
Version upgrades:
gradio: 5.11.0 -> 5.17.1
gradio_pdf: 0.0.19 -> 0.0.20
Tests
Updated:
MultimodalQnA/tests/test_compose_on_gaudi.sh
MultimodalQnA/tests/test_compose_on_xeon.sh
MultimodalQnA/tests/test_compose_on_rocm.sh
Co-authored-by: Harsha Ramayanam harsha.ramayanam@intel.com
Co-authored-by: Melanie Buehler melanie.h.buehler@intel.com
Co-authored-by: Dina Suehiro Jones dina.s.jones@intel.com
Co-authored-by: Omar Khleif omar.khleif@intel.com