I'm using the following code snippet to deploy a Hugging Face model to SageMaker, based on this blog post.
import json
import os

import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # IAM role used by the endpoint

# Retrieve the Hugging Face LLM (text-generation-inference) container image
llm_image = get_huggingface_llm_image_uri(
    "huggingface",
    version="0.9.3"
)

instance_type = "ml.g5.12xlarge"
number_of_gpu = 4
health_check_timeout = 300

# Define Model and Endpoint configuration parameters
config = {
    'HF_MODEL_ID': "meta-llama/Llama-2-7b-chat-hf",  # model_id from hf.co/models
    'SM_NUM_GPUS': json.dumps(number_of_gpu),        # number of GPUs used per replica
    'MAX_INPUT_LENGTH': json.dumps(2048),            # max length of input text
    'MAX_TOTAL_TOKENS': json.dumps(4096),            # max length of the generation (including input text)
    'MAX_BATCH_TOTAL_TOKENS': json.dumps(8192),      # limits the number of tokens that can be processed in parallel during generation
    'HUGGING_FACE_HUB_TOKEN': os.environ.get('HUGGING_FACE_HUB_TOKEN', ""),  # token to access gated/private models
}

llm_model = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env=config
)

llm = llm_model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    container_startup_health_check_timeout=health_check_timeout,  # 300 seconds (5 minutes) to load the model
)
This works, and I'm able to run inference against the endpoint that gets created.
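For reference, this is roughly how I'm invoking it (the prompt and generation parameters below are just placeholders, not my exact payload):

# Minimal sketch of an invocation against the deployed TGI endpoint;
# prompt and parameters are illustrative only.
response = llm.predict({
    "inputs": "What is Amazon SageMaker?",
    "parameters": {
        "max_new_tokens": 256,
        "temperature": 0.7,
    }
})
print(response)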
Is there a way to update the HF_MODEL_ID environment variable from meta-llama/Llama-2-7b-chat-hf to meta-llama/Llama-2-13b-chat-hf? And if that is possible, would changing that cause the endpoint to switch to using the 13b model?
I checked the Hugging Face + SageMaker docs but don't see a way to update the model. Any help would be much appreciated!
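The obvious fallback is to deploy a second endpoint with the new model id and then delete the old one, roughly like the sketch below (it just reuses the config from above), but I'd prefer to update the existing endpoint in place if that's possible:

# Fallback sketch: stand up a new endpoint with the 13b model id, then tear down the old one.
config_13b = {**config, 'HF_MODEL_ID': "meta-llama/Llama-2-13b-chat-hf"}

llm_model_13b = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env=config_13b
)

llm_13b = llm_model_13b.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    container_startup_health_check_timeout=health_check_timeout,
)

# Once the new endpoint is healthy, clean up the old 7b endpoint
llm.delete_model()
llm.delete_endpoint()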