Several issues about reproducing darknet results #9

xupeng1125 · 2018-02-02T08:06:10Z

Converting darknet weights to torch
The header of a darknet weights file contains four numbers: “major, minor, revision, seen”. “seen” has type “size_t”, while the others are “int”. On my system, “size_t” is 8 bytes rather than 4, which may depend on the compiler or platform. In my case, your code gives the warning “4 bytes left”.
If I change
major, minor, revision, seen = struct.unpack('4i', f.read(16))
to
major, minor, revision = struct.unpack('3i', f.read(12))
seen = struct.unpack('Q', f.read(8))[0]
it works fine.
The order of region layer channels.
In darknet, the order of region layer channels is “x, y, w, h, iou, probs”. In your implementation, it seems to be “iou, y, x, h, w, probs”.
I think you have already noticed the problem with the “iou” channel. In “convert_darknet_torch.py”, you use “transpose_weight” and “transpose_bias” to reorder the weights. That would be unnecessary if you changed the region layer to
iou = F.sigmoid(_feature[:, :, :, 4])
Because of the inconsistent order between “y, x, h, w” and “x, y, w, h”, I also have to use the following order to reproduce the darknet results:
center_offset = F.sigmoid(_feature[:, :, :, [1,0]])
size_norm = _feature[:, :, :, [3,2]]
The other option is to change every “yx_min” and “yx_max” to “xy_min” and “xy_max” and adapt the post-processing accordingly, but that needs much more work.
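For comparison, the weight-reordering route can be sketched as an index permutation over the output channels of the last convolution. This is only a minimal sketch: the helper name `region_channel_permutation` is hypothetical (it is not the repository's “transpose_weight”), and it assumes the channels are grouped per anchor, each group holding “x, y, w, h, iou” followed by the class scores.

```python
def region_channel_permutation(num_anchors, num_classes):
    """Return source-channel indices that reorder each anchor's group
    from (x, y, w, h, iou, probs...) to (iou, y, x, h, w, probs...).

    Hypothetical helper; assumes channels are laid out anchor by anchor.
    """
    per_anchor = 5 + num_classes
    # Positions within one anchor group, expressed in the darknet order:
    # iou=4, y=1, x=0, h=3, w=2, then the class scores unchanged.
    order = [4, 1, 0, 3, 2] + list(range(5, per_anchor))
    return [a * per_anchor + c for a in range(num_anchors) for c in order]


# Example: 2 anchors, 1 class -> 12 output channels
perm = region_channel_permutation(2, 1)
```

Indexing the weight and bias of the last convolution with this list along the output-channel axis would give the “iou, y, x, h, w, probs” order, whereas the indexing change above leaves the weights untouched.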
Reproducing the darknet training results (unsolved)
I still cannot reproduce the darknet training results on my own data using your code. I think the loss is kind of weird, but I have not found the problem.
I tried several implementations of YOLO in TensorFlow or PyTorch. None of them can reproduce the magic training results of darknet. Your implementation seems to be the most promising one. I really appreciate your work and hope you can make it better.

xmfbit · 2018-02-03T06:47:27Z

@xupeng1125 Regarding the first problem: darknet has two different versions. If minor==1, which is the earlier version, seen is 4 bytes. If minor==2, seen is 8 bytes. You can check the minor value of your model.
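
A minimal sketch of a header reader that handles both cases. It assumes a little-endian file and treats any minor of 2 or more as the 8-byte layout; both are assumptions, and `read_darknet_header` is a hypothetical helper, not part of the repository.

```python
import io
import struct


def read_darknet_header(f):
    """Read (major, minor, revision, seen) from an open binary file.

    Hypothetical helper: the width of "seen" is chosen from "minor".
    """
    major, minor, revision = struct.unpack('<3i', f.read(12))
    if minor >= 2:
        # Newer layout: "seen" stored as an 8-byte unsigned integer
        (seen,) = struct.unpack('<Q', f.read(8))
    else:
        # Earlier layout: "seen" stored as a 4-byte int
        (seen,) = struct.unpack('<i', f.read(4))
    return major, minor, revision, seen


# In-memory example: a header with an 8-byte "seen" plus one float weight
buf = io.BytesIO(struct.pack('<3iQ', 0, 2, 0, 32000) + struct.pack('<f', 1.0))
header = read_darknet_header(buf)
remaining = len(buf.read())  # bytes left for the actual weights
```

With the correct width for seen, the bytes that remain after the header belong entirely to the weights, so no stray 4 bytes should be left at the end.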
