I'm writing a production server that handles requests from a large, rotating set of clients. I have a custom manager class that handles everything, but I want to keep the models persistent in memory between requests. I'd like each request to be able to specify its own hyper-parameters, such as max_seq_len, temperature, etc. I'd prefer to do this as efficiently as possible by swapping in the custom parameters for each client request, rather than fully reloading the model from disk on every call that uses different parameters.
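To make the goal concrete, here is a rough sketch of the pattern I'm after. Every name in it (`RequestParams`, `ModelManager`, the `loader` and `run` callables) is hypothetical and stands in for whatever the repo actually exposes; it is not this project's API. It also assumes the parameters can be applied per call without rebuilding the model, which is exactly the part I'm unsure about.

```python
import threading
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class RequestParams:
    """Per-request settings (hypothetical; named after the ones mentioned above)."""
    max_seq_len: int = 2048
    temperature: float = 0.8


class ModelManager:
    """Loads each model once and reuses the in-memory instance across requests."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader            # the expensive step: reads weights from disk
        self._models: Dict[str, Any] = {}
        self._lock = threading.Lock()    # guards the cache when requests arrive concurrently

    def get(self, path: str) -> Any:
        """Return the cached model for `path`, loading it only on first use."""
        with self._lock:
            if path not in self._models:
                self._models[path] = self._loader(path)
            return self._models[path]

    def generate(
        self,
        path: str,
        prompt: str,
        params: RequestParams,
        run: Callable[[Any, str, RequestParams], Any],
    ) -> Any:
        """Run one request against the cached model with that request's params."""
        model = self.get(path)
        return run(model, prompt, params)
```

In this sketch, two calls to `generate` with different `RequestParams` share a single load of the model; only the `params` object changes between them.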
Is there a way I can do this with the current code? If not, what would I need to refactor?
I am working with jllllll's python package fork of this repo, but the changes are minimal, so I figured it appropriate to ask this question here.