(venv) PS path\to\torchvision_inference> python .\classification.py
[QUARK-INFO]: Checking custom ops library ...
[QUARK-WARNING]: The custom ops library path\to\torchvision_inference\venv\Lib\site-packages\quark\onnx\operators\custom_ops\lib\custom_ops.dll does NOT exist.
[QUARK-INFO]: Start compiling CPU version of custom ops library.
path\to\torchvision_inference\venv\Lib\site-packages\torch\utils\cpp_extension.py:414: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified.
warnings.warn(f'Error checking compiler version for {compiler}: {error}')
INFO: Could not find files for the given pattern(s).
[QUARK-ERROR]: CPU version of custom ops library compilation failed: Command '['where', 'cl']' returned non-zero exit status 1.
[QUARK-WARNING]: Custom ops library compilation failed: CPU version of custom ops library compilation failed: Command '['where', 'cl']' returned non-zero exit status 1.
[QUARK-INFO]: Checked custom ops library.
Model exported to ONNX at: models\resnet50.onnx
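The exported model above is typically produced with a torch.onnx.export call along the lines of the sketch below; the weights variant, input size, and input/output names are assumptions inferred from the log (1x3x224x224, opset 13), not the tutorial's exact script.

```python
# Minimal sketch of the export step, assuming a stock torchvision ResNet-50
# and a 1x3x224x224 input; opset 13 matches the opset warnings later in the log.
import torch
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.eval()

dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model,
    dummy_input,
    "models/resnet50.onnx",   # path reported in the log
    opset_version=13,
    input_names=["input"],    # assumed input name, reused in the later sketches
    output_names=["output"],
)
```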
[QUARK_INFO]: Time information:
2025-04-06 12:53:27.317460
[QUARK_INFO]: OS and CPU information:
system --- Windows
node --- gem12promax
release --- 11
version --- 10.0.26100
machine --- AMD64
processor --- AMD64 Family 25 Model 117 Stepping 2, AuthenticAMD
[QUARK_INFO]: Tools version information:
python --- 3.12.0
onnx --- 1.17.0
onnxruntime --- 1.20.1
quark.onnx --- 0.8+103c340fe2
[QUARK_INFO]: Quantized Configuration information:
model_input --- models\resnet50.onnx
model_output --- models\resnet50_quantized.onnx
calibration_data_reader --- <utils_custom.ImageDataReader object at 0x000001D0687E8410>
calibration_data_path --- None
quant_format --- QDQ
input_nodes --- []
output_nodes --- []
op_types_to_quantize --- []
extra_op_types_to_quantize --- []
per_channel --- False
reduce_range --- False
activation_type --- QUInt8
weight_type --- QInt8
nodes_to_quantize --- []
nodes_to_exclude --- []
subgraphs_to_exclude --- []
optimize_model --- True
use_external_data_format --- False
calibrate_method --- PowerOfTwoMethod.MinMSE
execution_providers --- ['CPUExecutionProvider']
enable_npu_cnn --- True
enable_npu_transformer --- False
specific_tensor_precision --- False
debug_mode --- False
convert_fp16_to_fp32 --- False
convert_nchw_to_nhwc --- False
include_cle --- False
include_sq --- False
include_rotation --- False
include_fast_ft --- False
extra_options --- {'ActivationSymmetric': True}
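The configuration dump above is what Quark prints when a quantization run starts. A minimal sketch of how such a run is usually set up is shown below; the ImageDataReader here is a hypothetical stand-in for the tutorial's utils_custom.ImageDataReader, the "XINT8" preset is assumed to be the source of these settings (QDQ format, UInt8 activations, Int8 weights, PowerOfTwoMethod.MinMSE calibration, enable_npu_cnn), and helper names can differ between Quark releases.

```python
# Minimal sketch, assuming the Quark ONNX API documented for Ryzen AI.
import numpy as np
from onnxruntime.quantization import CalibrationDataReader
from quark.onnx import ModelQuantizer
from quark.onnx.quantization.config import Config, get_default_config

class ImageDataReader(CalibrationDataReader):
    """Hypothetical reader that feeds preprocessed calibration batches."""
    def __init__(self, batches):
        self._iter = iter(batches)      # each item: {"input": np.ndarray}
    def get_next(self):
        return next(self._iter, None)   # None signals the end of calibration data

# Placeholder calibration data; the tutorial reads real images instead.
calib_batches = [{"input": np.random.rand(1, 3, 224, 224).astype(np.float32)}
                 for _ in range(8)]

quant_config = get_default_config("XINT8")          # assumed NPU-friendly preset
config = Config(global_quant_config=quant_config)
quantizer = ModelQuantizer(config)
quantizer.quantize_model("models/resnet50.onnx",
                         "models/resnet50_quantized.onnx",
                         ImageDataReader(calib_batches))
```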
[QUARK-INFO]: The input ONNX model models\resnet50.onnx can create InferenceSession successfully
[QUARK-INFO]: Obtained calibration data with 60 iters
[QUARK-INFO]: Removed initializers from input
[QUARK-INFO]: Simplified model successfully
[QUARK-INFO]: Duplicate the shared initializers in the model for separate quantization use across different nodes!
[QUARK-INFO]: Loading model...
[QUARK-INFO]: The input ONNX model C:/Users/reg/AppData/Local/Temp/vai.cpinit.yxju0wi1/model_cpinit.onnx can run inference successfully
[QUARK-INFO]: optimize the model for better hardware compatibility.
[QUARK-WARNING]: The opset version is 13 < 17. Skipping fusing layer normalization.
[QUARK-WARNING]: The opset version is 13 < 20. Skipping fusing Gelu.
[QUARK-INFO]: Start calibration...
[QUARK-INFO]: Start collecting data, runtime depends on your model size and the number of calibration dataset.
[QUARK-INFO]: Finding optimal threshold for each tensor using PowerOfTwoMethod.MinMSE algorithm ...
[QUARK-INFO]: Use all calibration data to calculate min mse
Computing range: 100%|███████████████████████████████████████████████████████████| 123/123 [15:16<00:00, 7.45s/tensor]
[QUARK-INFO]: Finished the calibration of PowerOfTwoMethod.MinMSE which costs 1046.7s
[QUARK-INFO]: Remove QuantizeLinear & DequantizeLinear on certain operations (such as conv-relu).
[QUARK-INFO]: Rescale GlobalAveragePool /avgpool/GlobalAveragePool with factor 1.0048828125 to simulate DPU behavior.
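The 1.0048828125 factor is not arbitrary: it appears to compensate for the DPU approximating the 1/49 division of a 7x7 GlobalAveragePool with the fixed-point multiplier 21/1024. This interpretation is an assumption based on Vitis AI's documented average-pool scaling, not something stated in the log itself:

```python
# Assumed origin of the rescale factor: the exact 1/49 average is replaced by
# the hardware's 21/1024 multiplier, so the quantizer pre-scales by 49 * 21/1024.
print(49 * 21 / 1024)   # 1.0048828125, matching the log line above
```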
[QUARK-INFO]: Adjust the quantize info to meet the compiler constraints
[QUARK-INFO]: Input pos of pooling layer /avgpool/GlobalAveragePool is 1. Output pos of pooling layer /avgpool/GlobalAveragePool is 4. Modify opos from 4 to 1.
[QUARK-INFO]: Adjust the quantize info to meet the compiler constraints
The operation types and their corresponding quantities of the input float model is shown in the table below.
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Op Type              ┃ Float Model                    ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Conv                 │ 53                             │
│ Relu                 │ 49                             │
│ MaxPool              │ 1                              │
│ Add                  │ 16                             │
│ GlobalAveragePool    │ 1                              │
│ Flatten              │ 1                              │
│ Gemm                 │ 1                              │
├──────────────────────┼────────────────────────────────┤
│ Quantized model path │ models\resnet50_quantized.onnx │
└──────────────────────┴────────────────────────────────┘
The quantized information for all operation types is shown in the table below.
The discrepancy between the operation types in the quantized model and the float model is due to the application of graph optimization.
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━┓
┃ Op Type           ┃ Activation ┃ Weights  ┃ Bias     ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━┩
│ Conv              │ UINT8(53)  │ INT8(53) │ INT8(53) │
│ MaxPool           │ UINT8(1)   │          │          │
│ Add               │ UINT8(16)  │          │          │
│ GlobalAveragePool │ UINT8(1)   │          │          │
│ Flatten           │ UINT8(1)   │          │          │
│ Gemm              │ UINT8(1)   │ INT8(1)  │ INT8(1)  │
└───────────────────┴────────────┴──────────┴──────────┘
Quark Quantized model saved at: models\resnet50_quantized.onnx
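The predictions that follow come from running the model with ONNX Runtime on the CPU before switching to the NPU. A minimal sketch of such a CPU inference run, assuming standard ImageNet preprocessing and a hypothetical image path (the tutorial's own helpers handle preprocessing and label lookup):

```python
# Minimal sketch of a CPU inference run on the quantized model; the image path,
# input name, and preprocessing constants are assumptions, not the tutorial's code.
import numpy as np
import onnxruntime as ort
from PIL import Image

def preprocess(path, size=(224, 224)):
    # Resize, scale to [0, 1], normalize with ImageNet stats, convert to NCHW.
    img = np.asarray(Image.open(path).convert("RGB").resize(size), dtype=np.float32) / 255.0
    img = (img - np.array([0.485, 0.456, 0.406], dtype=np.float32)) \
          / np.array([0.229, 0.224, 0.225], dtype=np.float32)
    return img.transpose(2, 0, 1)[None, ...].astype(np.float32)

session = ort.InferenceSession("models/resnet50_quantized.onnx",
                               providers=["CPUExecutionProvider"])
logits = session.run(None, {"input": preprocess("test_image.jpg")})[0]
top5 = np.argsort(logits[0])[::-1][:5]   # indices to map onto ImageNet class labels
```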
Image size: (224, 224)
----------------------------------------
Final top prediction is: Golden Retriever
----------------------------------------
Inference time: 42.2 ms
----------------------------------------
------------ Top 5 labels are: ----------------------------
['Golden Retriever' 'Labrador Retriever' 'Curly-coated Retriever'
'Norwich Terrier' 'Standard Poodle']
-----------------------------------------------------------
----------------------------------------
Final top prediction is: Golden Retriever
----------------------------------------
----------------------------------------
Inference time: 37.03 ms
----------------------------------------
------------ Top 5 labels are: ----------------------------
['Golden Retriever' 'Labrador Retriever' 'Curly-coated Retriever'
'Norwich Terrier' 'Standard Poodle']
-----------------------------------------------------------
APU Type: PHX/HPT
Setting environment for PHX/HPT
XLNX_VART_FIRMWARE= C:\Program Files\RyzenAI\1.4.0\voe-4.0-win_amd64\xclbins\phoenix\1x4.xclbin
NUM_OF_DPU_RUNNERS= 1
XLNX_TARGET_NAME= AMD_AIE2_Nx4_Overlay
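The lines above show the script selecting the Phoenix/Hawk Point NPU binary before the final run. Moving inference to the NPU typically only requires switching the execution provider; the sketch below shows the general pattern as an assumption, and depending on the Ryzen AI release additional provider_options (for example a cache directory or a vaip configuration file) may be required.

```python
# Minimal sketch of an NPU-backed session via the Vitis AI execution provider.
# Provider options are intentionally omitted; consult the Ryzen AI documentation
# for the exact options expected by your release.
import onnxruntime as ort

npu_session = ort.InferenceSession(
    "models/resnet50_quantized.onnx",
    providers=["VitisAIExecutionProvider"],
)
# npu_session.run(...) is then identical to the CPU case; supported subgraphs
# are dispatched to the NPU selected via the XLNX_* variables shown above.
```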
----------------------------------------
Final top prediction is: Golden Retriever
----------------------------------------
----------------------------------------
Inference time: 38.0 ms
----------------------------------------
------------ Top 5 labels are: ----------------------------
['Golden Retriever' 'Labrador Retriever' 'Curly-coated Retriever'
'Norwich Terrier' 'Standard Poodle']
-----------------------------------------------------------
(venv) PS path\to\torchvision_inference>