Is your feature request related to a problem? Please describe.
Users need to submit voice notes (in Indic languages or English) as questions, have answers generated from their knowledge base and prompt, and receive those answers back as voice notes in the same language as the original voice note. Currently, the user has to chain three separate API calls (STT --> RAG --> TTS) for every speech-to-speech (S2S) call.
Describe the solution you'd like
A single API endpoint that performs the full S2S call with minimal configuration.
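As a rough illustration of the request, the sketch below wraps the stt, rag, and tts steps behind one call. Every name here (`S2SConfig`, `speech_to_speech`, the `knowledge_base_id` and `prompt` fields, and the three injected callables) is hypothetical and not part of any existing API. It also assumes the STT step reports the detected language so the reply can be synthesized in that same language.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class S2SConfig:
    """Hypothetical minimal config for one S2S call."""
    knowledge_base_id: str  # assumed identifier of the user's knowledge base
    prompt: str             # assumed instruction passed to the RAG step


# Assumed signatures for the three existing steps (illustrative only).
SttFn = Callable[[bytes], Tuple[str, str]]   # audio -> (text, language)
RagFn = Callable[[str, str, str, str], str]  # (question, kb_id, prompt, language) -> answer
TtsFn = Callable[[str, str], bytes]          # (text, language) -> audio


def speech_to_speech(
    audio: bytes,
    config: S2SConfig,
    stt: SttFn,
    rag: RagFn,
    tts: TtsFn,
) -> bytes:
    """Run stt -> rag -> tts as a single call and return the spoken answer."""
    # 1. Transcribe the voice note and keep the detected language.
    question, language = stt(audio)
    # 2. Generate the answer from the knowledge base and prompt.
    answer = rag(question, config.knowledge_base_id, config.prompt, language)
    # 3. Synthesize the answer in the same language as the input.
    return tts(answer, language)
```

With something like this exposed as one endpoint, the caller would send the audio and a small config once, instead of orchestrating three requests and passing the language between them.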
Additional context
Glific S2S requirements
S2S PRD