Override training_step in CustomDPOTrainer to clear the CUDA cache after every training step #5019

zzc0430 · 2024-07-30T07:09:39Z

Following the issue huggingface/trl#1377, call torch.cuda.empty_cache() after each training step to clear the CUDA cache.

What does this PR do?

Reduces memory usage when using the DPO trainer, following huggingface/trl#1377.
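The idea can be sketched roughly as below. This is a minimal illustration, not the actual diff of this PR: `ClearCudaCacheMixin`, `ToyTrainer`, and `ToyDPOTrainer` are hypothetical names, and `ToyTrainer` is only a stand-in for the real DPO trainer so the snippet runs without trl installed. Arguments are forwarded unchanged so the sketch does not depend on the exact `training_step` signature.

```python
import torch


class ClearCudaCacheMixin:
    """Hypothetical mixin: run the parent's training_step, then clear the CUDA cache."""

    def training_step(self, *args, **kwargs):
        # Delegate the real work (forward pass, loss, backward) to the parent trainer.
        loss = super().training_step(*args, **kwargs)
        # Release unused cached blocks held by PyTorch's caching allocator.
        # Skipped on CPU-only machines.
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        return loss


class ToyTrainer:
    """Stand-in for the real trainer (assumption, for illustration only)."""

    def training_step(self, model, inputs):
        return torch.tensor(float(sum(inputs)))


class ToyDPOTrainer(ClearCudaCacheMixin, ToyTrainer):
    pass


loss = ToyDPOTrainer().training_step(None, [1.0, 2.0])
print(loss.item())
```

Note that `torch.cuda.empty_cache()` only releases cached blocks that are no longer in use; it does not free memory held by live tensors, and calling it every step can slow training somewhat.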

Before submitting

Did you read the contributor guideline?
Did you write any new necessary tests?


enzoliao · 2024-09-09T14:29:32Z

Hi all, could you tell me why this PR is still in the pending state?

overwrite training_step for CustomDPOTrainer

41d0dfc

according to the issue for trl: huggingface/trl#1377, call torch.cuda.empty_cache() after each step to clear the cuda cache

hiyouga added the "pending" label (This problem is yet to be addressed) on Aug 19, 2024

hiyouga force-pushed the main branch from 5569125 to b4c7dd3 on October 29, 2024 07:32
