# .NET: Allow `ISKFunction` to return `IAsyncEnumerable` (#1298)
Or, to put it another way: is there any way to get the GPT client to use the streaming endpoint and emit an event for each message?
We added support for streaming chat messages semi-recently, although I'm not sure this is what you are asking for: https://github.com/microsoft/semantic-kernel/blob/main/samples/dotnet/kernel-syntax-examples/Example33_StreamingChat.cs

Can you be a little more specific about the types of events you are looking to consume?
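For reference, the consumption pattern that sample demonstrates looks roughly like this. This is a minimal sketch based on the pre-v1 `IChatCompletion` API of that era; the builder and method names shifted between releases, so treat it as illustrative rather than exact:

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.AI.ChatCompletion;

var kernel = new KernelBuilder()
    .WithAzureChatCompletionService("deployment-name", "https://contoso.openai.azure.com/", "api-key")
    .Build();

var chatCompletion = kernel.GetService<IChatCompletion>();
var chat = chatCompletion.CreateNewChat("You are a helpful assistant.");
chat.AddUserMessage("Write a haiku about streaming.");

// Tokens are printed as they arrive instead of waiting for the full reply.
await foreach (string token in chatCompletion.GenerateMessageStreamAsync(chat))
{
    Console.Write(token);
}
```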
So, for example, in the copilot example I would want the results as a stream, even if that meant subscribing to a SignalR stream. That's OK, because it makes sense to have some of the data in a single payload, but I would want to see the chat response as soon as it started generating, as opposed to waiting for the whole thing to complete. Does that make sense?

Thank you :)
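As an aside, relaying such a stream over SignalR is straightforward once a token stream exists, since ASP.NET Core SignalR hub methods can return `IAsyncEnumerable<T>` directly. A hedged sketch, with a hypothetical hub that is not part of Semantic Kernel:

```csharp
using System.Runtime.CompilerServices;
using Microsoft.AspNetCore.SignalR;
using Microsoft.SemanticKernel.AI.ChatCompletion;

// Hypothetical hub for illustration -- not part of Semantic Kernel.
public class ChatHub : Hub
{
    private readonly IChatCompletion _chatCompletion;

    public ChatHub(IChatCompletion chatCompletion) => _chatCompletion = chatCompletion;

    // SignalR streams each yielded item to the caller as soon as it is produced.
    public async IAsyncEnumerable<string> StreamReply(
        string prompt,
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        var chat = _chatCompletion.CreateNewChat();
        chat.AddUserMessage(prompt);

        await foreach (var token in _chatCompletion
                           .GenerateMessageStreamAsync(chat)
                           .WithCancellation(cancellationToken))
        {
            yield return token;
        }
    }
}
```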
@RogerBarreto has some ideas in the works to make the requested behavior possible. TBD: when we get those changes in, let's make sure to tag this issue.
Happy to help out any way I can. Thank you :)
Since you don't have this yet in Semantic Kernel, let me give my view. I'm working on my own generative LLM integration here: https://github.com/MithrilMan/AIdentities

It works, but I keep hitting a problem around `IAsyncEnumerable`: when the API fails in the middle of generating streamed content, it fails badly, because the exception is thrown only once you start consuming the iterator, and .NET doesn't have a nice way to handle such cases. If you want to implement a retry pattern, for example with Polly (and we've seen how important that is with OpenAI, which has lately had big problems serving its API reliably), you can't just wrap the call to the stream method, because the exception of course arises only when you materialize the enumerator.

Also, when you stream chunks of text, what do you do if something goes wrong in the API? I haven't yet figured out how to handle such scenarios properly and effectively; in my case, when it fails, it fails badly, and this makes me wonder whether it's worth using streaming calls in skills at all. Of course it's cool to see my skill producing text as a stream so the user can start reading as soon as it's generated, and what I call an AIdentity can generate different kinds of "thoughts" that can be streamed too (think of it as a kind of log in some scenarios). But if it breaks in the middle, you'll have big trouble rolling back whatever you did with the partially generated text, not to mention the minor problems it causes if you try to parse that text in a markdown viewer, for example (which is why my chat section now builds the message in a plain div and switches to markdown once it's complete). I'm starting to think that maybe all this complexity isn't worth the effort.

I'm a bit torn as to whether implementing streaming at the function level is something I'd want to deal with. Maybe just being able to signal consumer events like "StartingTextGeneration" / "EndingTextGeneration" would be enough.
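To make the failure mode above concrete: a Polly policy wraps a single awaited call, but with `IAsyncEnumerable` the fault surfaces inside the consumer's `await foreach`, outside anything the policy executed. One blunt workaround is to retry the whole stream and only release items once a full pass succeeds. This is a sketch with a hypothetical helper, not an SK or Polly API:

```csharp
using System.Runtime.CompilerServices;

public static class StreamingRetry
{
    // Hypothetical helper -- not an SK or Polly API. It retries the *entire*
    // stream when enumeration faults: output is buffered and only yielded
    // once a full pass succeeds, trading away streaming latency for atomicity.
    public static async IAsyncEnumerable<T> RetryWholeStreamAsync<T>(
        Func<IAsyncEnumerable<T>> streamFactory,
        int maxAttempts = 3,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        for (var attempt = 1; ; attempt++)
        {
            var buffer = new List<T>();
            try
            {
                // The fault surfaces here, during enumeration -- not when
                // streamFactory() is invoked -- which is why wrapping only
                // the call in a retry policy never observes it.
                await foreach (var item in streamFactory().WithCancellation(ct))
                {
                    buffer.Add(item);
                }
            }
            catch when (attempt < maxAttempts)
            {
                continue; // discard the partial buffer and retry from scratch
            }

            foreach (var item in buffer)
            {
                yield return item;
            }
            yield break;
        }
    }
}
```

Note the trade-off: buffering until success sacrifices exactly the latency benefit streaming was supposed to provide, which is the tension described above, while the eager alternative (yield as you go) forces consumers to cope with partial output on failure.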
Make child of #1649
## Context and Problem Statement

Resolves #1649
Resolves #1298

It is quite common in co-pilot implementations to have a streamed output of messages from the LLM (large language model), and currently that is not possible while using the `ISKFunction.InvokeAsync` or `Kernel.RunAsync` methods. This forces users to work around the Kernel and Functions and use the `ITextCompletion` and `IChatCompletion` services directly, as those are the only interfaces that currently support streaming.

Streaming is a capability that not all providers support, so as part of our design we try to ensure the services have the proper abstractions to support streaming not only of text but also of other types of data like images, audio, video, etc. It also needs to be clear to the SK developer when they are attempting to get streaming data.

## Decision Drivers

1. The SK developer should be able to get streaming data from the Kernel and Functions using the `Kernel.RunAsync` or `ISKFunction.InvokeAsync` methods.
2. The SK developer should be able to get the data in a generic way, so the Kernel and Functions can stream data of any type, not limited to text.
3. The SK developer using streaming against a model that does not support streaming should still be able to use it, with a single streaming update representing the whole data.

## User Experience Goal

```csharp
// (providing the type as a generic parameter)

// Getting raw streaming data from the Kernel
await foreach (byte[] update in kernel.StreamingRunAsync<byte[]>(function, variables)) { }

// Getting a string as streaming data from the Kernel
await foreach (string update in kernel.StreamingRunAsync<string>(function, variables)) { }

// Getting a StreamingResultUpdate as streaming data from the Kernel
await foreach (StreamingResultUpdate update in kernel.StreamingRunAsync<StreamingResultUpdate>(variables, function))
// OR
await foreach (StreamingResultUpdate update in kernel.StreamingRunAsync(function, variables)) // defaults to the generic above
{
    Console.WriteLine(update);
}
```

## Out of Scope

- Streaming with plans will not be supported in this phase. Attempting to do so will throw an exception.
- Kernel streaming will not support multiple functions (pipeline).

### Contribution Checklist

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts) raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [ ] I didn't break anyone 😄
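On decision driver 3, a common way to present a non-streaming backend through the same streaming surface is to adapt the one-shot result into a single-update stream. A minimal sketch with hypothetical names, not the final SK API:

```csharp
// Hypothetical adapter illustrating decision driver 3 -- the names are
// illustrative, not the final SK API. A backend with no native streaming
// support is exposed as a stream yielding exactly one update that carries
// the whole result.
public static class SingleUpdateStream
{
    public static async IAsyncEnumerable<string> AsSingleUpdateStreamAsync(
        Func<Task<string>> completeAsync)
    {
        // One await, one yield: the consumer's await-foreach body runs once.
        yield return await completeAsync();
    }
}
```

A consumer could then wrap a non-streaming completion call, e.g. `SingleUpdateStream.AsSingleUpdateStreamAsync(() => GetCompletionAsync(prompt))`, and iterate it with the same `await foreach` used for genuinely streaming providers, keeping the calling code identical across providers.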
Important
Labeled High because, while it will not require a breaking change, it is very important to complete by v1.0.0.
Is there any way currently to allow `ISKFunction` to return an `IAsyncEnumerable`? If not, is there any plan to?