
.Net: Model Context Size Parameter for Configuration and Intelligent Planner Selection #3244

Closed
KSemenenko opened this issue Oct 19, 2023 · 6 comments

@KSemenenko

Description:
I am proposing a new feature that adds a parameter to specify the model's context size in our system's configuration. This enhancement would allow our Planner to make intelligent model selections based on the context-size requirements of a particular task.

Cost Optimization: With the flexibility to choose the model's context size, the Planner can use smaller, more cost-effective models such as GPT-3.5 with a 4k context for shorter tasks, leading to significant cost savings. Conversely, for tasks with a larger context, the Planner can seamlessly opt for models with a larger context window, such as the 16k-token variant.

Also, would it be possible to associate certain semantic functions with a specific model?

@matthewbolanos
Member

Thanks for raising this, @KSemenenko, we're going to discuss this a bit more internally before we decide on the best direction for solving this. We definitely agree this is something we should support.

@markwallace-microsoft markwallace-microsoft moved this to Sprint: In Progress in Semantic Kernel Oct 24, 2023
@markwallace-microsoft
Member

@KSemenenko we are doing some work to allow the AI service and the associated model request settings to be dynamically configured when a semantic function (aka LLM prompt) is executed.
We will allow multiple different model request settings to be configured for a prompt, e.g. for a service identified by an id you can set different request settings (max tokens, frequency penalty, ...). The model request settings can be for an OpenAI model or any arbitrary LLM.
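
For illustration, here is a minimal sketch of what this could look like, assuming the shape of the Semantic Kernel v1 .NET API; the model names, the service ids ("gpt35", "gpt35-16k"), and the API-key lookup are placeholders:

```csharp
using System;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")!;

// Register two chat services under distinct service ids.
var builder = Kernel.CreateBuilder();
builder.AddOpenAIChatCompletion(modelId: "gpt-3.5-turbo", apiKey: apiKey, serviceId: "gpt35");
builder.AddOpenAIChatCompletion(modelId: "gpt-3.5-turbo-16k", apiKey: apiKey, serviceId: "gpt35-16k");
Kernel kernel = builder.Build();

// One prompt, with different request settings keyed by service id.
var function = kernel.CreateFunctionFromPrompt(new PromptTemplateConfig("Summarize: {{$input}}")
{
    ExecutionSettings =
    {
        ["gpt35"]     = new OpenAIPromptExecutionSettings { MaxTokens = 256 },
        ["gpt35-16k"] = new OpenAIPromptExecutionSettings { MaxTokens = 2048 },
    }
});
```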

I'd like to get more information on your use case. Consider the following:

  • gpt-3.5-turbo has a maximum context of 4,097 tokens
  • gpt-3.5-turbo-16k has a maximum context of 16,385 tokens

Do you want to be able to say: if the prompt token count is less than, say, 1,000 tokens, then use gpt-3.5-turbo, and otherwise use gpt-3.5-turbo-16k?
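
Such a rule could be expressed as a custom AI service selector. Below is a hedged sketch, assuming the `IAIServiceSelector` interface that later shipped in Semantic Kernel v1 for .NET; the chars-per-token estimate and the 1,000-token threshold are illustrative:

```csharp
using System.Diagnostics.CodeAnalysis;
using System.Linq;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Services;

public sealed class TokenCountServiceSelector : IAIServiceSelector
{
    public bool TrySelectAIService<T>(
        Kernel kernel,
        KernelFunction function,
        KernelArguments arguments,
        [NotNullWhen(true)] out T? service,
        out PromptExecutionSettings? serviceSettings) where T : class, IAIService
    {
        // Crude token estimate: roughly 4 characters per token across all argument values.
        int estimatedTokens = arguments.Sum(kvp => kvp.Value?.ToString()?.Length ?? 0) / 4;

        // Route short prompts to the 4k model, longer ones to the 16k model.
        string targetModelId = estimatedTokens < 1000 ? "gpt-3.5-turbo" : "gpt-3.5-turbo-16k";

        service = kernel.GetAllServices<T>()
                        .FirstOrDefault(s => s.GetModelId() == targetModelId);
        serviceSettings = null; // fall back to the prompt's default settings
        return service is not null;
    }
}
```

The selector would then be registered with the kernel, e.g. via `builder.Services.AddSingleton<IAIServiceSelector>(new TokenCountServiceSelector());`.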

@KSemenenko
Author

@markwallace-microsoft Yes, you are absolutely right: while the chat is small, there is no point in switching to 16k, especially at GPT-4 prices. That makes for optimal use of the budget.

And I have another idea: maybe we can tell the planner which models it can use for specific functions/skills/plugins.

For example, suppose you have a model that you fine-tuned (or perhaps trained) for some specific task; it will work perfectly with one specific function, but it is not as good at other things.

Or, for example, a task such as summarization could always run on GPT-3.5, while ordinary tasks use GPT-4 (see the sketch below).
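
A hedged sketch of this kind of pinning, continuing the registration example above (Semantic Kernel v1 .NET API assumed; the "gpt35" service id is a placeholder that must match the id used when the service was registered):

```csharp
// Only a "gpt35" entry is supplied, so service selection resolves this
// function to the cheaper gpt-3.5 service, while other functions are free
// to use whatever service the selector (or default) picks.
var summarize = kernel.CreateFunctionFromPrompt(new PromptTemplateConfig(
    "Summarize the following text:\n{{$input}}")
{
    ExecutionSettings =
    {
        ["gpt35"] = new OpenAIPromptExecutionSettings { MaxTokens = 512, Temperature = 0.2 },
    }
});
```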

What do you think?

@markwallace-microsoft
Member

@KSemenenko thanks for the feedback.

Could you take a look at PR #3040?

It includes two examples:

  1. Shows how to specify the AI service a particular prompt uses, which should address your requirement to allow a prompt to be used with a specific service.
  2. Shows how to create a custom AI service selector; the sample uses token counts and the size of the prompt as the basis for the decision.

Your feedback would be much appreciated.

I still need to look into how to specify the AI service for a plan. I will update here when I have that information.

@KSemenenko
Author

I like the idea of ServiceId; it also works well with IAIServiceSelector, where you choose a model.

@markwallace-microsoft markwallace-microsoft moved this from Sprint: In Progress to Sprint: Planned in Semantic Kernel Nov 3, 2023
@markwallace-microsoft markwallace-microsoft added v1.0.1 Required for the Semantic Kernel v1.0.1 release v1 bugbash labels Dec 5, 2023
@markwallace-microsoft markwallace-microsoft added vnext and removed v1 bugbash v1.0.1 Required for the Semantic Kernel v1.0.1 release labels Dec 14, 2023
@matthewbolanos matthewbolanos added kernel Issues or pull requests impacting the core kernel .NET Issue or Pull requests regarding .NET code labels Jan 2, 2024
@github-actions github-actions bot changed the title Model Context Size Parameter for Configuration and Intelligent Planner Selection .Net: Model Context Size Parameter for Configuration and Intelligent Planner Selection Jan 2, 2024
@markwallace-microsoft
Member

All .Net issues prior to 1-Dec-2023 are being closed. Please re-open if this issue is still relevant to the .Net Semantic Kernel 1.x release. In the future, all issues that are inactive for more than 90 days will be labelled as 'stale' and closed 14 days later.

@markwallace-microsoft markwallace-microsoft added auto-closed Automatically closed and removed .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel vnext labels Mar 12, 2024
@github-project-automation github-project-automation bot moved this from Sprint: Planned to Sprint: Done in Semantic Kernel Mar 12, 2024