Reminder
System Info
llamafactory version: 0.9.2.dev0
Reproduction
Script Setting
model_name_or_path: ***
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
dataset: ***
template: llava
cutoff_len: 4096
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
output_dir: ***
logging_steps: 10
save_steps: 10000
plot_loss: true
overwrite_output_dir: true
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 5.0e-6
num_train_epochs: 5.0
lr_scheduler_type: cosine
warmup_ratio: 0.01
bf16: true
ddp_timeout: 180000000
lora_rank: 128
lora_alpha: 256
freeze_vision_tower: false
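One possible mitigation for the crash described below is to append the flag that the error message itself suggests. This is an untested sketch: it assumes LLaMA-Factory forwards ddp_find_unused_parameters (a standard HF TrainingArguments field) through to the trainer, and it sidesteps the unused-parameter bookkeeping rather than fixing the underlying bug:
ddp_find_unused_parameters: true  # sketch: lets DDP tolerate parameters that receive no gradient in a step
Note that enabling this makes DDP traverse the autograd graph every step to find unused parameters, so it trades some throughput for the workaround.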
Expected behavior
freeze_vision_tower: false should work smoothly for LLaVA-series training, but instead it raises this error:
Error Message
RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel, and by making sure all forward function outputs participate in calculating loss.
Others
I have checked the related Qwen2-VL freeze_vision_tower=false bug reports and updated to the latest repo: https://github.com/hiyouga/LLaMA-Factory/issues/5680.
Qwen2-VL works perfectly under freeze_vision_tower: false, while the same script fails when adapted to the LLaVA-1.5 series (7B, 13B).
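For comparison, a sketch of the only lines that differ in the working Qwen2-VL run (the model path is a hypothetical placeholder, not the one actually used):
model_name_or_path: Qwen/Qwen2-VL-7B-Instruct  # hypothetical placeholder path
template: qwen2_vl
freeze_vision_tower: false
Everything else in the recipe above stays the same; only the model and template change.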
End
Thank you in advance for your support of and contribution to this wonderful repo!