Training doesn't converge #50

Open
fengziyue opened this issue Jan 26, 2021 · 4 comments

Comments

@fengziyue
Contributor

Hi Fangchang:

Thank you so much for sharing this great project!

I have tested your pre-trained self-supervised model; its RMSE is around 1300, which matches your paper.
But when I try to train the model with this command:
python main.py --train-mode sparse+photo
on 2 Tesla V100 GPUs for around 15 epochs, it only converges to an RMSE of ~8k-9k and never improves further. I didn't change any hyperparameters from your code; the only difference is a smaller batch size than you mentioned (8).

Are there any parameters or options I need to change from this GitHub repo? Or do you have any suggestions on training?

Thank you so much!

Sincerely,
Ziyue Feng

@Zoengkyun

@fengziyue
I have the same question.
I tried batchSize = 8 with 2 TITAN RTX GPUs, but it doesn't converge either. Same situation as yours: RMSE ~8k-9k.
Then I increased the weight of photometric_loss to 1; the first epoch converged to an RMSE of 1400+, but later epochs diverged.
I see that in his trained model file (sparse+photo), bs = 16 (4 TITAN RTX?). I don't have that many GPUs to experiment with.
Have you tried batchSize = 16?
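
For anyone limited to two GPUs, one common way to approximate bs = 16 is gradient accumulation: run two micro-batches of 8 and average their gradients before each optimizer step. The sketch below only illustrates why that matches the gradient of one larger batch for a mean-reduced loss. The toy model (y = w * x), the data, and the helper names are hypothetical and are not taken from this repo. Whether this fixes the convergence problem here is untested, and if the network uses batch normalization, its statistics are still computed per micro-batch.

```python
# Pure-Python sketch: averaging gradients over micro-batches reproduces
# the gradient of one larger batch when the loss is a mean over samples.
# The model (y = w * x) and the data are hypothetical, not from the repo.

def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch):
    """Average per-micro-batch gradients, as gradient accumulation does.

    Assumes len(xs) is a multiple of micro_batch.
    """
    steps = len(xs) // micro_batch
    total = 0.0
    for i in range(steps):
        lo, hi = i * micro_batch, (i + 1) * micro_batch
        # Dividing by `steps` keeps the scale equal to one large batch.
        total += mse_grad(w, xs[lo:hi], ys[lo:hi]) / steps
    return total


xs = [float(i) for i in range(16)]
ys = [3.0 * x + 1.0 for x in xs]

full = mse_grad(0.5, xs, ys)                          # one batch of 16
accum = accumulated_grad(0.5, xs, ys, micro_batch=8)  # two batches of 8
print(full, accum)
```

In a real training loop the same idea means dividing each micro-batch loss by the number of accumulation steps and calling the optimizer step only after the last micro-batch.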

@fengziyue
Contributor Author

fengziyue commented Apr 7, 2021 via email

@Zoengkyun

@fangchangma
Thanks for sharing!
We can't get it to converge with self-supervised training.
Do you have any suggestions on training?
This is very important, thank you!

@Thermaloo

@fangchangma
I also encountered the same problem. Could you give us a response?
