[YOLOV7-QAT] Cannot convert onnx to trt engine #53

Open
HoangTienDuc opened this issue Dec 4, 2023 · 4 comments

@HoangTienDuc

Hi @wanghr323, thanks for your YOLOv7 QAT work.

I followed your tutorial and QAT training succeeded:

Loading and preparing results...
pycocotools unable to run: Results do not correspond to current coco set
QAT Finetuning 10 / 10, Loss: 0.67706, LR: 1e-06: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 590/590 [04:10<00:00,  2.36it/s]
               Class      Images      Labels           P           R      mAP@.5  mAP@.5:.95: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 597/597 [01:20<00:00,  7.39it/s]
                 all        5963        6888       0.953       0.915       0.942       0.809

Evaluating pycocotools mAP... saving qat_models/trained_qat/pgie/1/_predictions.json...
loading annotations into memory...
Done (t=0.44s)
creating index...
index created!
Loading and preparing results...
pycocotools unable to run: Results do not correspond to current coco set

/usr/src/tensorrt/bin/trtexec --onnx=qat_models/trained_qat/pgie/1/qat.onnx --int8 --fp16 --workspace=1024000 --minShapes=images:1x3x416x416 --optShapes=images:16x3x416x416 --maxShapes=images:32x3x416x416

After that, I converted the QAT-trained model to ONNX and then tried to build a TensorRT engine from the ONNX model, but the ONNX-to-TensorRT conversion failed.
Please help me check it.

[12/04/2023-09:07:36] [E] [TRT] ModelImporter.cpp:729: --- End node ---
[12/04/2023-09:07:36] [E] [TRT] ModelImporter.cpp:732: ERROR: builtin_op_importers.cpp:1216 In function QuantDequantLinearHelper:
[6] Assertion failed: scaleAllPositive && "Scale coefficients must all be positive"
[12/04/2023-09:07:36] [E] Failed to parse onnx file
[12/04/2023-09:07:36] [I] Finish parsing network model
[12/04/2023-09:07:36] [E] Parsing model failed
[12/04/2023-09:07:36] [E] Failed to create engine from model or file.
[12/04/2023-09:07:36] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=qat_models/trained_qat/pgie/1/qat.onnx --int8 --fp16 --workspace=1024000 --minShapes=images:4x3x416x416 --optShapes=images:4x3x416x416 --maxShapes=images:4x3x416x416
@HoangTienDuc
Author

This is the full log

&&&& RUNNING TensorRT.trtexec [TensorRT v8500] # /usr/src/tensorrt/bin/trtexec --onnx=qat_models/trained_qat/pgie/1/qat.onnx --int8 --fp16 --workspace=1024000 --minShapes=images:4x3x416x416 --optShapes=images:4x3x416x416 --maxShapes=images:4x3x416x416
[12/04/2023-09:06:56] [W] --workspace flag has been deprecated by --memPoolSize flag.
[12/04/2023-09:06:56] [I] === Model Options ===
[12/04/2023-09:06:56] [I] Format: ONNX
[12/04/2023-09:06:56] [I] Model: qat_models/trained_qat/pgie/1/qat.onnx
[12/04/2023-09:06:56] [I] Output:
[12/04/2023-09:06:56] [I] === Build Options ===
[12/04/2023-09:06:56] [I] Max batch: explicit batch
[12/04/2023-09:06:56] [I] Memory Pools: workspace: 1.024e+06 MiB, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[12/04/2023-09:06:56] [I] minTiming: 1
[12/04/2023-09:06:56] [I] avgTiming: 8
[12/04/2023-09:06:56] [I] Precision: FP32+FP16+INT8
[12/04/2023-09:06:56] [I] LayerPrecisions: 
[12/04/2023-09:06:56] [I] Calibration: Dynamic
[12/04/2023-09:06:56] [I] Refit: Disabled
[12/04/2023-09:06:56] [I] Sparsity: Disabled
[12/04/2023-09:06:56] [I] Safe mode: Disabled
[12/04/2023-09:06:56] [I] DirectIO mode: Disabled
[12/04/2023-09:06:56] [I] Restricted mode: Disabled
[12/04/2023-09:06:56] [I] Build only: Disabled
[12/04/2023-09:06:56] [I] Save engine: 
[12/04/2023-09:06:56] [I] Load engine: 
[12/04/2023-09:06:56] [I] Profiling verbosity: 0
[12/04/2023-09:06:56] [I] Tactic sources: Using default tactic sources
[12/04/2023-09:06:56] [I] timingCacheMode: local
[12/04/2023-09:06:56] [I] timingCacheFile: 
[12/04/2023-09:06:56] [I] Heuristic: Disabled
[12/04/2023-09:06:56] [I] Preview Features: Use default preview flags.
[12/04/2023-09:06:56] [I] Input(s)s format: fp32:CHW
[12/04/2023-09:06:56] [I] Output(s)s format: fp32:CHW
[12/04/2023-09:06:56] [I] Input build shape: images=4x3x416x416+4x3x416x416+4x3x416x416
[12/04/2023-09:06:56] [I] Input calibration shapes: model
[12/04/2023-09:06:56] [I] === System Options ===
[12/04/2023-09:06:56] [I] Device: 0
[12/04/2023-09:06:56] [I] DLACore: 
[12/04/2023-09:06:56] [I] Plugins:
[12/04/2023-09:06:56] [I] === Inference Options ===
[12/04/2023-09:06:56] [I] Batch: Explicit
[12/04/2023-09:06:56] [I] Input inference shape: images=4x3x416x416
[12/04/2023-09:06:56] [I] Iterations: 10
[12/04/2023-09:06:56] [I] Duration: 3s (+ 200ms warm up)
[12/04/2023-09:06:56] [I] Sleep time: 0ms
[12/04/2023-09:06:56] [I] Idle time: 0ms
[12/04/2023-09:06:56] [I] Streams: 1
[12/04/2023-09:06:56] [I] ExposeDMA: Disabled
[12/04/2023-09:06:56] [I] Data transfers: Enabled
[12/04/2023-09:06:56] [I] Spin-wait: Disabled
[12/04/2023-09:06:56] [I] Multithreading: Disabled
[12/04/2023-09:06:56] [I] CUDA Graph: Disabled
[12/04/2023-09:06:56] [I] Separate profiling: Disabled
[12/04/2023-09:06:56] [I] Time Deserialize: Disabled
[12/04/2023-09:06:56] [I] Time Refit: Disabled
[12/04/2023-09:06:56] [I] NVTX verbosity: 0
[12/04/2023-09:06:56] [I] Persistent Cache Ratio: 0
[12/04/2023-09:06:56] [I] Inputs:
[12/04/2023-09:06:56] [I] === Reporting Options ===
[12/04/2023-09:06:56] [I] Verbose: Disabled
[12/04/2023-09:06:56] [I] Averages: 10 inferences
[12/04/2023-09:06:56] [I] Percentiles: 90,95,99
[12/04/2023-09:06:56] [I] Dump refittable layers:Disabled
[12/04/2023-09:06:56] [I] Dump output: Disabled
[12/04/2023-09:06:56] [I] Profile: Disabled
[12/04/2023-09:06:56] [I] Export timing to JSON file: 
[12/04/2023-09:06:56] [I] Export output to JSON file: 
[12/04/2023-09:06:56] [I] Export profile to JSON file: 
[12/04/2023-09:06:56] [I] 
[12/04/2023-09:06:56] [I] === Device Information ===
[12/04/2023-09:06:56] [I] Selected Device: NVIDIA GeForce RTX 3060
[12/04/2023-09:06:56] [I] Compute Capability: 8.6
[12/04/2023-09:06:56] [I] SMs: 28
[12/04/2023-09:06:56] [I] Compute Clock Rate: 1.777 GHz
[12/04/2023-09:06:56] [I] Device Global Memory: 12041 MiB
[12/04/2023-09:06:56] [I] Shared Memory per SM: 100 KiB
[12/04/2023-09:06:56] [I] Memory Bus Width: 192 bits (ECC disabled)
[12/04/2023-09:06:56] [I] Memory Clock Rate: 7.501 GHz
[12/04/2023-09:06:56] [I] 
[12/04/2023-09:06:56] [I] TensorRT version: 8.5.0
[12/04/2023-09:06:56] [I] [TRT] [MemUsageChange] Init CUDA: CPU +11, GPU +0, now: CPU 24, GPU 801 (MiB)
[12/04/2023-09:06:57] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +421, GPU +114, now: CPU 497, GPU 915 (MiB)
[12/04/2023-09:06:57] [I] Start parsing network model
[12/04/2023-09:06:58] [I] [TRT] ----------------------------------------------------------------
[12/04/2023-09:06:58] [I] [TRT] Input filename:   qat_models/trained_qat/pgie/1/qat.onnx
[12/04/2023-09:06:58] [I] [TRT] ONNX IR version:  0.0.7
[12/04/2023-09:06:58] [I] [TRT] Opset version:    13
[12/04/2023-09:06:58] [I] [TRT] Producer name:    pytorch
[12/04/2023-09:06:58] [I] [TRT] Producer version: 1.13.0
[12/04/2023-09:06:58] [I] [TRT] Domain:           
[12/04/2023-09:06:58] [I] [TRT] Model version:    0
[12/04/2023-09:06:58] [I] [TRT] Doc string:       
[12/04/2023-09:06:58] [I] [TRT] ----------------------------------------------------------------
[12/04/2023-09:06:58] [E] [TRT] ModelImporter.cpp:740: While parsing node number 467 [QuantizeLinear -> "onnx::DequantizeLinear_924"]:
[12/04/2023-09:06:58] [E] [TRT] ModelImporter.cpp:741: --- Begin node ---
[12/04/2023-09:06:58] [E] [TRT] ModelImporter.cpp:742: input: "model.51.cv1.conv.weight"
input: "onnx::QuantizeLinear_921"
input: "onnx::QuantizeLinear_1885"
output: "onnx::DequantizeLinear_924"
name: "QuantizeLinear_467"
op_type: "QuantizeLinear"
attribute {
  name: "axis"
  i: 0
  type: INT
}

[12/04/2023-09:06:58] [E] [TRT] ModelImporter.cpp:743: --- End node ---
[12/04/2023-09:06:58] [E] [TRT] ModelImporter.cpp:746: ERROR: builtin_op_importers.cpp:1192 In function QuantDequantLinearHelper:
[6] Assertion failed: scaleAllPositive && "Scale coefficients must all be positive"
[12/04/2023-09:06:58] [E] Failed to parse onnx file
[12/04/2023-09:06:58] [I] Finish parsing network model
[12/04/2023-09:06:58] [E] Parsing model failed
[12/04/2023-09:06:58] [E] Failed to create engine from model or file.
[12/04/2023-09:06:58] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8500] # /usr/src/tensorrt/bin/trtexec --onnx=qat_models/trained_qat/pgie/1/qat.onnx --int8 --fp16 --workspace=1024000 --minShapes=images:4x3x416x416 --optShapes=images:4x3x416x416 --maxShapes=images:4x3x416x416

@hopef
Collaborator

hopef commented Dec 5, 2023

The "Scale coefficients must all be positive" error occurs when a stored scale value is zero. This is a bug in the pytorch-quantization library; it can be fixed by constraining the amax value (e.g., amax.clamp(1e-6)) when exporting to ONNX.
tensor_quantizer.py
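
For reference, a minimal sketch of that workaround (this assumes NVIDIA's pytorch-quantization package and that TensorQuantizer keeps amax in its `_amax` buffer; the helper name and the 1e-6 floor are illustrative, not part of the library):

```python
import torch
from pytorch_quantization.nn import TensorQuantizer

def clamp_quantizer_amax(model: torch.nn.Module, floor: float = 1e-6) -> None:
    """Clamp every quantizer's amax to a small positive floor before ONNX
    export, so no exported QuantizeLinear scale ends up as zero."""
    for module in model.modules():
        if isinstance(module, TensorQuantizer):
            # _amax may be a per-tensor scalar or a per-channel 1-D tensor;
            # it is absent until the quantizer has been calibrated.
            amax = getattr(module, "_amax", None)
            if amax is not None:
                amax.clamp_(min=floor)

# Usage (illustrative), right before export:
# clamp_quantizer_amax(model)
# torch.onnx.export(model, dummy_input, "qat.onnx", opset_version=13)
```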

@hopef
Collaborator

hopef commented Dec 5, 2023

Alternatively, you can patch the scale values directly in the exported ONNX file using the onnx package in Python.
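
For example, a minimal sketch (the output path and the 1e-6 floor are illustrative): it clamps every initializer that feeds the scale input (input index 1) of a QuantizeLinear or DequantizeLinear node.

```python
import numpy as np
import onnx
from onnx import numpy_helper

model = onnx.load("qat_models/trained_qat/pgie/1/qat.onnx")
inits = {init.name: init for init in model.graph.initializer}

for node in model.graph.node:
    if node.op_type in ("QuantizeLinear", "DequantizeLinear"):
        scale_name = node.input[1]  # inputs are (x, y_scale, y_zero_point)
        if scale_name in inits:
            scale = numpy_helper.to_array(inits[scale_name])
            # Replace any non-positive entries with a small positive floor.
            fixed = np.clip(scale, 1e-6, None).astype(scale.dtype)
            inits[scale_name].CopyFrom(
                numpy_helper.from_array(fixed, name=scale_name))

onnx.save(model, "qat_models/trained_qat/pgie/1/qat_fixed.onnx")
```

Note that, depending on the exporter, the scale may be produced by a Constant node rather than stored as an initializer (as with /model.7/conv/_weight_quantizer/Constant_output_0 in the log below); in that case the Constant node's value attribute has to be patched instead.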

@Doctor-L-end

@hopef I ran into the same problem, but the error still occurs even after adding the amax.clamp(1e-6) you mentioned above.
[01/12/2024-11:24:30] [E] [TRT] ModelImporter.cpp:771: While parsing node number 175 [QuantizeLinear -> "/model.7/conv/_weight_quantizer/QuantizeLinear_output_0"]:
[01/12/2024-11:24:30] [E] [TRT] ModelImporter.cpp:772: --- Begin node ---
[01/12/2024-11:24:30] [E] [TRT] ModelImporter.cpp:773: input: "model.7.conv.weight"
input: "/model.7/conv/_weight_quantizer/Constant_output_0"
input: "/model.7/conv/_weight_quantizer/Constant_1_output_0"
output: "/model.7/conv/_weight_quantizer/QuantizeLinear_output_0"
name: "/model.7/conv/_weight_quantizer/QuantizeLinear"
op_type: "QuantizeLinear"
attribute {
  name: "axis"
  i: 0
  type: INT
}

[01/12/2024-11:24:30] [E] [TRT] ModelImporter.cpp:774: --- End node ---
[01/12/2024-11:24:30] [E] [TRT] ModelImporter.cpp:777: ERROR: builtin_op_importers.cpp:1197 In function QuantDequantLinearHelper:
[6] Assertion failed: scaleAllPositive && "Scale coefficients must all be positive"
[01/12/2024-11:24:30] [E] Failed to parse onnx file
[01/12/2024-11:24:30] [I] Finished parsing network model. Parse time: 0.0231789
[01/12/2024-11:24:30] [E] Parsing model failed
[01/12/2024-11:24:30] [E] Failed to create engine from model or file.
[01/12/2024-11:24:30] [E] Engine set up failed
