I'm using the following code snippet to deploy a Hugging Face model to SageMaker, based on this blog post.
import json
import os

import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # IAM role used by the endpoint

# Retrieve the Hugging Face LLM (text-generation-inference) container image
llm_image = get_huggingface_llm_image_uri(
    "huggingface",
    version="0.9.3"
)

instance_type = "ml.g5.12xlarge"
number_of_gpu = 4
health_check_timeout = 300

# Define Model and Endpoint configuration parameters
config = {
    'HF_MODEL_ID': "meta-llama/Llama-2-7b-chat-hf",  # model_id from hf.co/models
    'SM_NUM_GPUS': json.dumps(number_of_gpu),        # number of GPUs used per replica
    'MAX_INPUT_LENGTH': json.dumps(2048),            # max length of input text
    'MAX_TOTAL_TOKENS': json.dumps(4096),            # max length of the generation (including input text)
    'MAX_BATCH_TOTAL_TOKENS': json.dumps(8192),      # limits the number of tokens that can be processed in parallel during generation
    'HUGGING_FACE_HUB_TOKEN': os.environ.get('HUGGING_FACE_HUB_TOKEN', ""),  # token to access gated/private models
}

llm_model = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env=config
)

llm = llm_model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    container_startup_health_check_timeout=health_check_timeout,  # 300 seconds (5 minutes) to load the model
)
This works, and I'm able to run inference against the endpoint that gets created.
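For reference, this is roughly how I'm invoking it (the prompt and generation parameters below are just placeholders, not my exact payload):

# Minimal sketch of an invocation against the deployed TGI endpoint;
# prompt and parameters are illustrative only.
response = llm.predict({
    "inputs": "What is Amazon SageMaker?",
    "parameters": {
        "max_new_tokens": 256,
        "temperature": 0.7,
    }
})
print(response)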
Is there a way to update the HF_MODEL_ID environment variable from meta-llama/Llama-2-7b-chat-hf to meta-llama/Llama-2-13b-chat-hf? And if that is possible, would changing that cause the endpoint to switch to using the 13b model?
I checked the Hugging Face + SageMaker docs but don't see a way to update the model. Any help would be much appreciated!
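The obvious fallback is to deploy a second endpoint with the new model id and then delete the old one, roughly like the sketch below (it just reuses the config from above), but I'd prefer to update the existing endpoint in place if that's possible:

# Fallback sketch: stand up a new endpoint with the 13b model id, then tear down the old one.
config_13b = {**config, 'HF_MODEL_ID': "meta-llama/Llama-2-13b-chat-hf"}

llm_model_13b = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env=config_13b
)

llm_13b = llm_model_13b.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    container_startup_health_check_timeout=health_check_timeout,
)

# Once the new endpoint is healthy, clean up the old 7b endpoint
llm.delete_model()
llm.delete_endpoint()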