About _epoch_train and _epoch_val #7
Comments
Hi, I am trying to run trainer.py, but I don't understand the argument "--load_model_path". There is nothing in the current folder, and I am not sure what kind of pretrained model needs to be loaded here. Any advice? |
I think '--load_model_path' is only used when 'pretrained' is set, but log.txt shows an error when no model files are loaded. |
Exactly, I got something like this in the logs.txt file: I thought the program had just stopped here because of the error message. |
I found that it has not stopped; it just isn't printing anything. |
Yeah, I left it running all night, but val_loss is always 0 in logs.txt, so something must be wrong and needs to be fixed. |
That's because all validation losses are set to 0 in '_epoch_val'; you can try uncommenting the code in '_epoch_val'. But my train loss is very large; is it the same for you? By the way, have you tried the tester? |
Yes, extremely large train loss. I haven't tried the tester yet. |
I have tried tester.py, but it isn't working; some places need a tensor.cpu() conversion. Have you managed to run tester.py to completion? |
Yes, just convert with tensor.cpu() as the error suggests. |
However, my test results are all the same: all my predicted captions are identical.
|
I get the same caption too. Have you found the reason?
|
not yet |
When I run
|
Did any of you run into a problem like this? WARNING:tensorflow:From /content/drive/Shared drives/shared drive-zma/ACL18/utils/logger.py:15: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead. Traceback (most recent call last): RuntimeError: The size of tensor a (210) must match the size of tensor b (0) at non-singleton dimension 1 |
Hi @fireholder! Did you eventually give up trying to solve the issue? Were all the predicted captions always identical? |
My train loss is also very large, and all my predicted captions are the same: "No acute cardiopulmonary abnormality". Could anyone help me out? Thanks! Could it be a Python 2 vs. Python 3 issue, since I used Python 3? |
Hi, were you able to decrease the loss? I am also facing the same issue. |
I am also facing the same issue. Were you able to solve this? |
I guess the train loss is large because the author uses MSELoss for predicting tags. If there are 156 different tags, the squared error term can be on the order of (156-0)^2 = 24336, which is why the loss is so big. You can change it to L1Loss or decrease the lambda argument for the tags loss (if you find that reasonable). |
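To illustrate the point about loss scale, here is a minimal pure-Python sketch comparing mean squared error and mean absolute error on the same prediction error, and showing how a weight on the tag term scales its contribution. The function names, the sample values, and `lambda_tag` are assumptions for illustration, not the repository's code; `mse` and `l1` mirror what PyTorch's `nn.MSELoss` and `nn.L1Loss` compute with the default mean reduction.

```python
def mse(pred, target):
    """Mean squared error, as nn.MSELoss computes with mean reduction."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def l1(pred, target):
    """Mean absolute error, as nn.L1Loss computes with mean reduction."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


# Hypothetical values: every prediction is off by 10 from its target.
pred = [10.0, 10.0, 10.0, 10.0]
target = [0.0, 0.0, 0.0, 0.0]

mse_value = mse(pred, target)  # squares each error before averaging
l1_value = l1(pred, target)    # keeps each error linear

# Hypothetical weight on the tag term; smaller values shrink its share.
lambda_tag = 0.01
weighted_tag_loss = lambda_tag * mse_value

print(mse_value, l1_value, weighted_tag_loss)
```

Whether the large loss in this project actually comes from the tag term is the commenter's guess; logging each loss component separately would be one way to check before changing the loss function or the weight.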
In the debugger.py and tester.py files of this project, I'm getting an error at the third-to-last line of the following section of code.
The error is: |
Has anybody solved the problem of all predicted captions being the same? |
While training, I ran into a problem where progress came to a standstill. I found that it was the functions _epoch_train and _epoch_val that stopped it, since they raise NotImplementedError. I wonder why this happens and how to fix it.
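One common reason for this error is a base class whose `_epoch_train` and `_epoch_val` are placeholders meant to be overridden by a subclass. The sketch below shows that pattern; the class names `BaseTrainer` and `ToyTrainer` and the returned loss values are hypothetical and are not taken from the repository's trainer.py.

```python
class BaseTrainer:
    """Hypothetical base class: drives the loop, leaves epoch logic abstract."""

    def __init__(self, epochs):
        self.epochs = epochs

    def train(self):
        history = []
        for epoch in range(self.epochs):
            train_loss = self._epoch_train()
            val_loss = self._epoch_val()
            history.append((epoch, train_loss, val_loss))
        return history

    def _epoch_train(self):
        # Placeholder: calling this on the base class raises the error.
        raise NotImplementedError

    def _epoch_val(self):
        # Placeholder: a subclass is expected to override this as well.
        raise NotImplementedError


class ToyTrainer(BaseTrainer):
    """Hypothetical subclass that supplies both epoch methods."""

    def _epoch_train(self):
        return 1.0  # stand-in for a real training loss

    def _epoch_val(self):
        return 0.5  # stand-in for a real validation loss


history = ToyTrainer(epochs=2).train()
print(history)

# Running the base class directly reproduces the NotImplementedError.
try:
    BaseTrainer(epochs=1).train()
except NotImplementedError:
    base_raises = True
else:
    base_raises = False
print(base_raises)
```

If the error appears in practice, it may be worth checking that the object being trained is an instance of the subclass rather than the base class, and that both methods are defined under exactly these names.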