Issue High Loss
#4180
Replies: 1 comment 1 reply
Which template did you use? Remember to use |
Currently on commit 3a023bc, running SFT with QLoRA on Llama-3-8B with a reasonably sized dataset of 100k samples.
However, I am getting an average loss of ≈3.0, for example:
'loss': 3.0288, 'grad_norm': 3.2035417556762695, 'learning_rate': 3.639776304355244e-05,
whereas the average loss is usually ≈1.0. Could this be a dataset effect or an issue with Llama-3-8B? At inference time, the model shows minor degradation in learning.
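To put the two loss values on a more intuitive scale, here is a minimal sketch that converts them to perplexity. It assumes the logged loss is the mean per-token cross-entropy in nats, which is the common convention for causal LM fine-tuning but is not confirmed in this thread. The `perplexity` helper is a hypothetical name used only for illustration.

```python
import math


def perplexity(mean_ce_loss: float) -> float:
    """Convert a mean per-token cross-entropy loss (in nats) to perplexity.

    Assumption: the trainer reports the natural-log cross-entropy averaged
    over tokens; this thread does not state that explicitly.
    """
    return math.exp(mean_ce_loss)


observed = perplexity(3.0288)  # loss value from the log line quoted above
expected = perplexity(1.0)     # the "usual" average loss mentioned above

print(f"observed perplexity: {observed:.2f}")
print(f"usual perplexity:    {expected:.2f}")
print(f"ratio:               {observed / expected:.2f}x")
```

Under that assumption, the gap between a loss of ≈3.0 and ≈1.0 is a multiplicative gap in perplexity, so it may be worth ruling out a prompt template or label masking mismatch before attributing it to the dataset alone.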