When fine-tuning llama3-8b, eval_loss keeps rising; I tried mixing multiple datasets, but it still had no effect. How can I solve this? #4566
Unanswered
MemoryOldTime asked this question in Q&A
System Info
8× Ascend 910A NPUs. The datasets are alpaca_en (21.7 MB) and alpaca_gpt4_en (41.3 MB), mixed together and fine-tuned with LoRA.
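For context, dataset mixing in LLaMA-Factory is declared in the training YAML rather than in the launch script. A minimal sketch, assuming the built-in alpaca_en and alpaca_gpt4_en dataset entries and the optional mix_strategy data argument (values here are illustrative, not the poster's actual settings):

### dataset mixing (sketch)
dataset: alpaca_en,alpaca_gpt4_en   # comma-separated list mixes both datasets
mix_strategy: concat                # alternatives: interleave_under, interleave_over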
Reproduction
#!/bin/bash
# Distributed LoRA SFT launch on a single node with all 8 Ascend NPUs.
NPROC_PER_NODE=8
NNODES=1
RANK=0

ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun \
    --nproc_per_node $NPROC_PER_NODE \
    --nnodes $NNODES \
    --node_rank $RANK \
    src/train.py examples/train_lora/llama3_lora_sft_ds0.yaml
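The referenced examples/train_lora/llama3_lora_sft_ds0.yaml is not included in the post. A minimal sketch of what such a LLaMA-Factory SFT config typically contains, assuming the stock example layout (all values below are assumptions, not the poster's actual settings):

### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct  # assumed checkpoint

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all

### dataset
dataset: alpaca_en,alpaca_gpt4_en
template: llama3
cutoff_len: 1024

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
bf16: true
deepspeed: examples/deepspeed/ds_z0_config.json  # the "ds0" suffix suggests ZeRO stage 0

### eval
val_size: 0.1
eval_strategy: steps
eval_steps: 500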
Expected behavior
It does not seem to be caused by insufficient data, and the model parameters evidently cannot change any further, so this does not look like normal behavior. Is there any other way to solve this problem?
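Since the question asks for remedies: a rising eval_loss while train loss keeps falling usually indicates overfitting, so the usual first knobs are a lower learning rate, fewer epochs, and stronger LoRA regularization. A sketch of such adjustments in the training YAML (all values are illustrative assumptions, not verified fixes):

### overfitting mitigations (illustrative values only)
learning_rate: 5.0e-5      # halve the learning rate
num_train_epochs: 1.0      # stop before eval_loss turns upward
lora_rank: 8               # smaller adapter = fewer trainable parameters
lora_dropout: 0.1          # add dropout inside the LoRA layers
val_size: 0.1              # keep a held-out split so eval_loss is meaningful
eval_steps: 200            # evaluate often enough to catch the turning point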
Others
No response