Added tests + updated docs for data prep mp3 support #51
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
wget https://github.com/intel/intel-extension-for-transformers/raw/main/intel_extension_for_transformers/neural_chat/assets/audio/sample.wav -O ${audio_fn}

export audio_mp3_fn="sample_audio.mp3"
wget wget https://audio-samples.github.io/samples/mp3/blizzard_biased/sample-0.mp3 -O ${audio_mp3_fn}
- wget wget https://audio-samples.github.io/samples/mp3/blizzard_biased/sample-0.mp3 -O ${audio_mp3_fn}
+ wget https://audio-samples.github.io/samples/mp3/blizzard_biased/sample-0.mp3 -O ${audio_mp3_fn}
Also, I'm not sure about the licensing of this blizzard dataset. It might be safer to use another ITREX sample. For example:
wget https://github.com/intel/intel-extension-for-transformers/raw/refs/tags/v1.5/intel_extension_for_transformers/neural_chat/ui/customized/talkingbot/src/lib/components/talkbot/assets/mid-age-man.mp3
Another idea is that instead of adding another audio file download to this example (the list of downloads is getting long), maybe the documentation should list what types of files are supported. For example, even for image ingestion, we support png, gif, and jpg, but I don't see that listed in the documentation. 🤔
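As a rough illustration of that idea, here is a minimal sketch of an extension check that the README or a test script could point to. `is_supported_file` is a hypothetical helper, not something in the dataprep microservice, and the extension list is only an assumption pieced together from this thread (png, gif, jpg, mp4, wav, mp3); the real list should come from the ingestion code.

```shell
# Hypothetical helper, not part of the dataprep microservice.
# The extension list is an assumption taken from this review thread.
is_supported_file() {
  # Reject names that have no extension at all.
  case "$1" in
    *.*) ;;
    *) return 1 ;;
  esac
  # Lowercase the extension so "SAMPLE.MP3" also matches.
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    png|gif|jpg|mp4|wav|mp3) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: report which candidate files would be accepted.
for f in sample.wav sample_audio.mp3 notes.txt; do
  if is_supported_file "$f"; then
    echo "$f: supported"
  else
    echo "$f: skipped"
  fi
done
```

Listing the types once in the docs, with a check like this in the tests, would avoid adding a new download for every format.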
wget wget https://audio-samples.github.io/samples/mp3/blizzard_biased/sample-0.mp3 -O ${audio_mp3_fn}
```

Test dataprep microservice with generating transcript. This command updates a knowledge base by uploading a local video .mp4 and an audio .wav file.
Add .mp3 to this sentence (if we are keeping mp3 as part of the example)
cd $LOG_PATH
echo "Downloading image and video"
wget https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true -O ${image_fn}
wget https://audio-samples.github.io/samples/mp3/blizzard_biased/sample-0.mp3 -O ${audio_fn}
Same comment about swapping this for an ITREX file
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Description
Added tests and updated documentation for mp3 ingestion feature.
Issues
opea-project#1549
Type of change
List the type of change like below. Please delete options that are not relevant.
Dependencies
N/A
Tests
Ran tests locally