Make Ansari LLM Generate a Non-Streamed Response #73

OdyAsh · 2024-11-08T19:51:37Z

Currently, ansari.py is written so that it always expects litellm to return a streamed response (e.g., it relies on many "yield" statements).

We want to refactor this code so that it also supports non-streamed generation. This is needed for endpoints such as WhatsApp, where a message has to be sent as a whole and can't be streamed.
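As a rough illustration of the idea, the sketch below shows one possible shape for the refactor: keep the generator-based streaming path, and add a thin non-streamed wrapper that drains it and returns a single string. The names `generate_streamed` and `generate_complete` are hypothetical and are not taken from ansari.py; the chunks are hard-coded stand-ins for what the LLM client would yield.

```python
from typing import Iterator


def generate_streamed(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for the existing streaming path: yields text chunks."""
    # Assumption: in the real code these chunks would come from the
    # LLM client's streaming iterator rather than a fixed tuple.
    for chunk in ("Reply to: ", prompt):
        yield chunk


def generate_complete(prompt: str) -> str:
    """Non-streamed path: drain the generator and return one whole message."""
    return "".join(generate_streamed(prompt))


if __name__ == "__main__":
    # A WhatsApp-style caller would send this single string as one message.
    print(generate_complete("salam"))
```

A wrapper like this only changes how the result is delivered to the caller. Whether requesting a non-streamed completion from the LLM directly would be a better fit is a separate design choice that this sketch does not address.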

Additionally, the streaming-only design seems to be increasing the final response time for WhatsApp users. See this issue for more info.

OdyAsh added this to v2-backend-kanban-board Nov 8, 2024

OdyAsh converted this from a draft issue Nov 8, 2024

OdyAsh mentioned this issue Nov 8, 2024

WhatsApp UX Issues - Rendering #72

Open

OdyAsh moved this to Backlog in Ansari Work Nov 8, 2024

OdyAsh added this to Ansari Work Nov 8, 2024

OdyAsh mentioned this issue Nov 10, 2024

WhatsApp UX Issues - Misc. #76

Open

OdyAsh self-assigned this Nov 12, 2024

OdyAsh added the performance quality label (issues related to the quality/polish/responsiveness of the app itself) Nov 12, 2024
