yolov5 on milk-v tpu 256 #13411
Comments
👋 Hello @tcpipchip, thank you for your interest in YOLOv5 🚀! It looks like you're working with the MILK-V 256, a RISC-V processor, and encountering a segmentation fault when running your exported model. No worries, we're here to help! 😊 For 🐛 Bug Reports like this, a minimum reproducible example is crucial, and you've done a great job in providing detailed steps and descriptions! This helps us understand the issue you're facing better. An Ultralytics engineer will review your report and assist you soon. In the meantime, please verify you've set up your environment correctly:
Requirements: Ensure you have Python>=3.8.0 installed with all the relevant libraries from the
Environments: YOLOv5 runs smoothly in various environments like notebooks (Google Colab, Kaggle, etc.), cloud environments (Google Cloud, Amazon Web Services), or using Docker images with all dependencies pre-installed. Ensure your environment is up-to-date and configured correctly, including CUDA, cuDNN, Python, and PyTorch installations, particularly if you are leveraging GPU resources.
Debugging Tips
Stay tuned, and thank you for providing a comprehensive report! 📝 If there's anything else you can share about the exact error message or log outputs, feel free to add that information here. Our team is eager to assist you further! 🚀
Yes, the requirements are OK!
@tcpipchip I'm sorry, but we can't provide private training services. However, you can follow our Train Custom Data guide to train your model. If you encounter issues, feel free to ask for help here.
But do you have any tips about my problem?
It seems like the issue might be related to the conversion process of your custom model. Ensure your model's architecture matches the pre-trained model you successfully converted, and double-check the conversion steps for any discrepancies.
I am now investigating whether the image size is the cause, and testing with other pre-trained .pt files from third parties.
Testing with different image sizes and pre-trained models is a good approach. Ensure that the input dimensions match those expected by the model, and verify compatibility with the latest YOLOv5 version. If issues persist, consider checking the model's architecture and conversion process for inconsistencies.
Got it working, after 100 hours of trying.
@tcpipchip glad to hear you resolved your issue with YOLOv5 on the MILK-V TPU! For others who might encounter similar challenges with custom model deployment on TPU devices, I recommend checking our model export guide to understand the correct conversion steps and requirements for various hardware targets.
Thanks. I will add your link to the blog. Milk-V uses an export .py to ONNX; it looks like the same code as your company's.
Thank you for sharing your blog post! While we appreciate the mention, please note that YOLOv5's ONNX export functionality is open-source under the AGPL-3.0 license, as documented in our model export guide. We're glad you found the TPU deployment process helpful.
Search before asking
YOLOv5 Component
No response
Bug
Hi Sir,
I recently got the MILK-V 256, a RISC-V processor.
I followed these instructions to recognize objects:
https://milkv.io/docs/duo/application-development/tpu/tpu-introduction
https://milkv.io/docs/duo/application-development/tpu/tpu-docker
https://milkv.io/docs/duo/application-development/tpu/tpu-yolov5
best.zip
This works very well using YOLOv5 with the pre-trained https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5n.pt
But when I train my own weights on Colab (best.pt) and convert them to run on the MILK-V, I always get a segmentation fault.
train_data.zip
Attached is my training data from Colab. On Colab it works and I can run inference.
The best.pt is attached as well.
Environment
YOLOv5 and Docker; all requirements for yolov5 master are satisfied.
Minimal Reproducible Example
Segmentation fault
It looks like the problem is in my best.pt, because the pre-trained yolov5n.pt works fine!
Additional
Sequence using yolov5n.pt (everything works fine):
For more help on how to use Docker, head to https://docs.docker.com/go/guides/
ubuntu@DESKTOP-UHGFA4M:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ubuntu@DESKTOP-UHGFA4M:~$ docker run --privileged --name duotpu -v /workspace -it sophgo/tpuc_dev:v3.1
docker: Error response from daemon: Conflict. The container name "/duotpu" is already in use by container "2a46fc75400fa362ed00811b4ec34bba2612506d3938b0e72f8fabab41350246". You have to remove (or rename) that container to be able to reuse that name.
See 'docker run --help'.
ubuntu@DESKTOP-UHGFA4M:~$ docker run --privileged --name duotpu -v /workspace -it sophgo/tpuc_dev:v3.1
docker: Error response from daemon: Conflict. The container name "/duotpu" is already in use by container "2a46fc75400fa362ed00811b4ec34bba2612506d3938b0e72f8fabab41350246". You have to remove (or rename) that container to be able to reuse that name.
See 'docker run --help'.
ubuntu@DESKTOP-UHGFA4M:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2a46fc75400f sophgo/tpuc_dev:v3.1 "/bin/bash" 2 days ago Up 12 seconds duotpu
ubuntu@DESKTOP-UHGFA4M:~$ docker exec -it 2a46fc75400f /bin/bash
root@2a46fc75400f:/workspace# pytorch
bash: pytorch: command not found
root@2a46fc75400f:/workspace# ls
best.pt master tpu-mlir tpu-sdk yolov5-master yolov5n_torch
root@2a46fc75400f:/workspace# cd yolov5n_torch/
root@2a46fc75400f:/workspace/yolov5n_torch# ls
_weight_map.csv yolov5n_cv181x_int8_sym_final.mlir yolov5n_jit.pt
best.pt yolov5n_cv181x_int8_sym_model_outputs.npz yolov5n_origin.mlir
cat.jpg yolov5n_cv181x_int8_sym_tpu.mlir yolov5n_top_f32_all_origin_weight.npz
train_data yolov5n_cv181x_int8_sym_tpu_outputs.npz yolov5n_top_f32_all_weight.npz
train_data.zip yolov5n_in_f32.npz yolov5n_top_outputs.npz
work yolov5n_in_ori.npz yolov5n_tpu_addressed_cv181x_int8_sym_weight.npz
yolov5n.mlir yolov5n_int8_fuse.cvimodel yolov5n_tpu_addressed_cv181x_int8_sym_weight_fix.npz
yolov5n_cali_table yolov5n_int8_fuse_tensor_info.txt yolov5n_tpu_lowered_cv181x_int8_sym_weight.npz
root@2a46fc75400f:/workspace/yolov5n_torch# ls r*
ls: cannot access 'r*': No such file or directory
root@2a46fc75400f:/workspace/yolov5n_torch# cd ..
root@2a46fc75400f:/workspace# ls
best.pt master tpu-mlir tpu-sdk yolov5-master yolov5n_torch
root@2a46fc75400f:/workspace# cd yolov5-master/
root@2a46fc75400f:/workspace/yolov5-master# dir
CITATION.cff README.zh-CN.md data main.py segment val.py
CONTRIBUTING.md benchmarks.py detect.py models train.py yolov5n_jit.pt
LICENSE best.pt export.py pyproject.toml tutorial.ipynb
README.md classify hubconf.py requirements.txt utils
root@2a46fc75400f:/workspace/yolov5-master# cat requirements.txt
# YOLOv5 requirements
# Usage: pip install -r requirements.txt

# Base ------------------------------------------------------------------------
gitpython>=3.1.30
matplotlib>=3.3
numpy>=1.23.5
opencv-python>=4.1.1
pillow>=10.3.0
psutil # system resources
PyYAML>=5.3.1
requests>=2.32.2
scipy>=1.4.1
thop>=0.1.1 # FLOPs computation
torch>=1.8.0 # see https://pytorch.org/get-started/locally (recommended)
torchvision>=0.9.0
tqdm>=4.66.3
ultralytics>=8.2.34 # https://ultralytics.com
# protobuf<=3.20.1 # #8012

# Logging ---------------------------------------------------------------------
# tensorboard>=2.4.1
# clearml>=1.2.0
# comet

# Plotting --------------------------------------------------------------------
pandas>=1.1.4
seaborn>=0.11.0

# Export ----------------------------------------------------------------------
# coremltools>=6.0 # CoreML export
# onnx>=1.10.0 # ONNX export
# onnx-simplifier>=0.4.1 # ONNX simplifier
# nvidia-pyindex # TensorRT export
# nvidia-tensorrt # TensorRT export
# scikit-learn<=1.1.2 # CoreML quantization
# tensorflow>=2.4.0,<=2.13.1 # TF exports (-cpu, -aarch64, -macos)
# tensorflowjs>=3.9.0 # TF.js export
# openvino-dev>=2023.0 # OpenVINO export

# Deploy ----------------------------------------------------------------------
setuptools>=70.0.0 # Snyk vulnerability fix
# tritonclient[all]~=2.24.0

# Extras ----------------------------------------------------------------------
# ipython # interactive notebook
# mss # screenshots
# albumentations>=1.0.3
# pycocotools>=2.0.6 # COCO mAP
root@2a46fc75400f:/workspace/yolov5-master# nano requirements.txt
root@2a46fc75400f:/workspace/yolov5-master# pip install -r requirements.txt
Requirement already satisfied: gitpython>=3.1.30 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 5)) (3.1.32)
Requirement already satisfied: matplotlib>=3.3 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 6)) (3.7.2)
Requirement already satisfied: numpy>=1.23.5 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 7)) (1.24.3)
Requirement already satisfied: opencv-python>=4.1.1 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 8)) (4.8.0.74)
Requirement already satisfied: pillow>=10.3.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 9)) (11.0.0)
Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 10)) (5.9.5)
Requirement already satisfied: PyYAML>=5.3.1 in /usr/lib/python3/dist-packages (from -r requirements.txt (line 11)) (5.4.1)
Requirement already satisfied: requests>=2.32.2 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 12)) (2.32.3)
Requirement already satisfied: scipy>=1.4.1 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 13)) (1.11.1)
Requirement already satisfied: thop>=0.1.1 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 14)) (0.1.1.post2209072238)
Requirement already satisfied: torch>=1.8.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 15)) (2.0.1+cpu)
Requirement already satisfied: torchvision>=0.9.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 16)) (0.15.2+cpu)
Requirement already satisfied: tqdm>=4.66.3 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 17)) (4.67.0)
Requirement already satisfied: ultralytics>=8.2.34 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 18)) (8.3.28)
Requirement already satisfied: pandas>=1.1.4 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 27)) (2.0.3)
Requirement already satisfied: seaborn>=0.11.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 28)) (0.13.2)
Requirement already satisfied: setuptools>=70.0.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 42)) (75.3.0)
Requirement already satisfied: gitdb<5,>=4.0.1 in /usr/local/lib/python3.10/dist-packages (from gitpython>=3.1.30->-r requirements.txt (line 5)) (4.0.10)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.3->-r requirements.txt (line 6)) (1.1.0)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.3->-r requirements.txt (line 6)) (2.8.2)
Requirement already satisfied: pyparsing<3.1,>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.3->-r requirements.txt (line 6)) (3.0.9)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.3->-r requirements.txt (line 6)) (4.42.1)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.3->-r requirements.txt (line 6)) (0.11.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.3->-r requirements.txt (line 6)) (1.4.5)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.3->-r requirements.txt (line 6)) (23.1)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->-r requirements.txt (line 12)) (3.4)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->-r requirements.txt (line 12)) (3.2.0)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->-r requirements.txt (line 12)) (1.26.16)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->-r requirements.txt (line 12)) (2023.7.22)
Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch>=1.8.0->-r requirements.txt (line 15)) (1.12)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch>=1.8.0->-r requirements.txt (line 15)) (4.5.0)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=1.8.0->-r requirements.txt (line 15)) (3.1.2)
Requirement already satisfied: filelock in /usr/lib/python3/dist-packages (from torch>=1.8.0->-r requirements.txt (line 15)) (3.6.0)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=1.8.0->-r requirements.txt (line 15)) (3.1)
Requirement already satisfied: py-cpuinfo in /usr/local/lib/python3.10/dist-packages (from ultralytics>=8.2.34->-r requirements.txt (line 18)) (9.0.0)
Requirement already satisfied: ultralytics-thop>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from ultralytics>=8.2.34->-r requirements.txt (line 18)) (2.0.11)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas>=1.1.4->-r requirements.txt (line 27)) (2023.3)
Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas>=1.1.4->-r requirements.txt (line 27)) (2023.3)
Requirement already satisfied: smmap<6,>=3.0.1 in /usr/local/lib/python3.10/dist-packages (from gitdb<5,>=4.0.1->gitpython>=3.1.30->-r requirements.txt (line 5)) (5.0.0)
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.7->matplotlib>=3.3->-r requirements.txt (line 6)) (1.16.0)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=1.8.0->-r requirements.txt (line 15)) (2.1.3)
Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=1.8.0->-r requirements.txt (line 15)) (1.3.0)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
root@2a46fc75400f:/workspace/yolov5-master# ls
CITATION.cff README.zh-CN.md data main.py segment val.py
CONTRIBUTING.md benchmarks.py detect.py models train.py yolov5n_jit.pt
LICENSE best.pt export.py pyproject.toml tutorial.ipynb
README.md classify hubconf.py requirements.txt utils
root@2a46fc75400f:/workspace/yolov5-master# nano main.py
root@2a46fc75400f:/workspace/yolov5-master#
root@2a46fc75400f:/workspace/yolov5-master#
root@2a46fc75400f:/workspace/yolov5-master#
root@2a46fc75400f:/workspace/yolov5-master# wget https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5n.pt
--2024-11-11 19:18:11-- https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5n.pt
Resolving github.com (github.com)... 20.201.28.151
Connecting to github.com (github.com)|20.201.28.151|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/264818686/3444cd1f-277c-414f-bdc9-3ac8ed6062df?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20241111%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241111T111811Z&X-Amz-Expires=300&X-Amz-Signature=b7761184e059f5a596b94e432bf731d13dc16857dab233d44d18080fc0f23350&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3Dyolov5n.pt&response-content-type=application%2Foctet-stream [following]
--2024-11-11 19:18:11-- https://objects.githubusercontent.com/github-production-release-asset-2e65be/264818686/3444cd1f-277c-414f-bdc9-3ac8ed6062df?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20241111%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241111T111811Z&X-Amz-Expires=300&X-Amz-Signature=b7761184e059f5a596b94e432bf731d13dc16857dab233d44d18080fc0f23350&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3Dyolov5n.pt&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.111.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4062133 (3.9M) [application/octet-stream]
Saving to: ‘yolov5n.pt’
yolov5n.pt 100%[================================================>] 3.87M 8.31MB/s in 0.5s
2024-11-11 19:18:12 (8.31 MB/s) - ‘yolov5n.pt’ saved [4062133/4062133]
root@2a46fc75400f:/workspace/yolov5-master# cat main.py
import torch
from models.experimental import attempt_download
model = torch.load(attempt_download("./yolov5n.pt"),
map_location=torch.device('cpu'))['model'].float()
model.eval()
model.model[-1].export = True
torch.jit.trace(model, torch.rand(1, 3, 640, 640), strict=False).save('./yolov5n_jit.pt')
root@2a46fc75400f:/workspace/yolov5-master# python main.py
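Note on input size: main.py traces the model with a fixed 1x3x640x640 input, and the conversion log further down reports resize_dims [640, 640], keep_ratio_mode letterbox, pad_type center, and a scale of 0.0039216 (about 1/255). The sketch below only illustrates that arithmetic under those logged values. The function names are hypothetical and this is not the code TPU-MLIR or the Milk-V sample runs.

```python
def letterbox_geometry(src_w, src_h, dst_w=640, dst_h=640):
    """Aspect-preserving resize into dst_w x dst_h with centered padding.

    Hypothetical helper: returns the resized size and the (left, top,
    right, bottom) padding needed to reach the destination size.
    """
    ratio = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * ratio), round(src_h * ratio)
    pad_x, pad_y = dst_w - new_w, dst_h - new_h
    left, top = pad_x // 2, pad_y // 2
    return {"size": (new_w, new_h), "pad": (left, top, pad_x - left, pad_y - top)}


def normalize(pixel, mean=0.0, scale=1 / 255):
    """Apply the (x - mean) * scale formula shown in the deploy log."""
    return (pixel - mean) * scale


# A 1280x720 frame is scaled to 640x360 and padded top and bottom.
geo = letterbox_geometry(1280, 720)
print(geo)
print(round(1 / 255, 7), normalize(255))
```

If a custom model was trained or traced at a different size, the same arithmetic applies with other dst_w and dst_h values; whether the image size was the cause of the segmentation fault is not stated in this thread.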
root@2a46fc75400f:/workspace/yolov5-master# cp yolov5n_jit.pt /workspace/yolov5-master/^C
root@2a46fc75400f:/workspace/yolov5-master# cd ..
root@2a46fc75400f:/workspace# cd yolov5n_torch
root@2a46fc75400f:/workspace/yolov5n_torch# cp /workspace/yolov5-master/yolov5n_jit.pt .
root@2a46fc75400f:/workspace/yolov5n_torch# source ./tpu-mlir/envsetup.sh
bash: ./tpu-mlir/envsetup.sh: No such file or directory
root@2a46fc75400f:/workspace/yolov5n_torch# cd ..
root@2a46fc75400f:/workspace# source ./tpu-mlir/envsetup.sh
root@2a46fc75400f:/workspace# cd yolov5n_torch/
root@2a46fc75400f:/workspace/yolov5n_torch# cp -rf ${TPUC_ROOT}/regression/dataset/COCO2017 .
root@2a46fc75400f:/workspace/yolov5n_torch# cp -rf ${TPUC_ROOT}/regression/image .
root@2a46fc75400f:/workspace/yolov5n_torch# model_transform.py \
Traceback (most recent call last):
File "/workspace/tpu-mlir/python/tools/model_transform.py", line 272, in <module>
tool = get_model_transform(args)
File "/workspace/tpu-mlir/python/tools/model_transform.py", line 232, in get_model_transform
tool = TorchTransformer(args.model_name, args.model_def, args.input_shapes,
File "/workspace/tpu-mlir/python/tools/model_transform.py", line 204, in __init__
self.converter = TorchConverter(self.model_name, self.model_def, input_shapes, input_types,
File "/workspace/tpu-mlir/python/transform/TorchConverter.py", line 55, in __init__
self.load_torch_model(torch_file, input_shapes, input_types, output_names)
File "/workspace/tpu-mlir/python/transform/TorchConverter.py", line 251, in load_torch_model
self.model = torch.jit.load(torch_file, map_location=torch.device('cpu'))
File "/usr/local/lib/python3.10/dist-packages/torch/jit/_serialization.py", line 152, in load
raise ValueError("The provided filename {} does not exist".format(f)) # type: ignore[str-bytes-safe]
ValueError: The provided filename ../yolov5n_jit.pt does not exist
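Note on the traceback above: the first model_transform.py attempt pointed at ../yolov5n_jit.pt, while the file had been copied into the current directory (the later run uses ./yolov5n_jit.pt). A minimal sketch of checking a relative model path before starting a long conversion follows. resolve_model_path is a hypothetical helper, not part of TPU-MLIR, and the demo recreates the directory layout in a temporary folder.

```python
import tempfile
from pathlib import Path


def resolve_model_path(model_def, cwd):
    """Resolve model_def relative to cwd and fail early if it is missing."""
    path = (Path(cwd) / model_def).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"{model_def!r} resolves to {path}, which does not exist")
    return path


# Demo: mimic /workspace/yolov5n_torch containing yolov5n_jit.pt.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "yolov5n_torch"
    work.mkdir()
    (work / "yolov5n_jit.pt").write_bytes(b"")

    found = resolve_model_path("./yolov5n_jit.pt", work)
    try:
        resolve_model_path("../yolov5n_jit.pt", work)
        missing_detected = False
    except FileNotFoundError:
        missing_detected = True

print(found.name, missing_detected)
```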
root@2a46fc75400f:/workspace/yolov5n_torch# model_transform.py \
Save mlir file: yolov5n_origin.mlir
[Running]: tpuc-opt yolov5n_origin.mlir --shape-infer --canonicalize --extra-optimize -o yolov5n.mlir
[Success]: tpuc-opt yolov5n_origin.mlir --shape-infer --canonicalize --extra-optimize -o yolov5n.mlir
Mlir file generated:yolov5n.mlir
2024/11/11 19:23:10 - INFO :
load_config Preprocess args :
resize_dims : [640, 640]
keep_aspect_ratio : True
keep_ratio_mode : letterbox
pad_value : 0
pad_type : center
input_dims : [640, 640]
--------------------------
mean : [0.0, 0.0, 0.0]
scale : [0.0039216, 0.0039216, 0.0039216]
--------------------------
pixel_format : rgb
channel_format : nchw
[CMD]: model_runner.py --input yolov5n_in_f32.npz --model ./yolov5n_jit.pt --output yolov5n_ref_outputs.npz
80: 100%|████████████████████████████████████████████████████████████████████████| 1230/1230 [00:01<00:00, 1134.76it/s]
Saving yolov5n_ref_outputs.npz
[CMD]: model_runner.py --input yolov5n_in_f32.npz --model yolov5n.mlir --output yolov5n_top_outputs.npz
[##################################################] 100%
Saving yolov5n_top_outputs.npz
[Running]: npz_tool.py compare yolov5n_top_outputs.npz yolov5n_ref_outputs.npz --tolerance 0.99,0.99 --except - -vv
compare 1249: 100%|█████████████████████████████████████████████████████████████████▋| 199/200 [00:06<00:00, 37.17it/s]
[x.1 ] EQUAL [PASSED]
(1, 3, 640, 640) float32
[input.62 ] SIMILAR [PASSED]
(1, 16, 320, 320) float32
cosine_similarity = 1.000000
euclidean_similarity = 0.999999
sqnr_similarity = 123.899231
[input.26 ] SIMILAR [PASSED]
(1, 16, 320, 320) float32
cosine_similarity = 1.000000
euclidean_similarity = 1.000000
sqnr_similarity = 127.972746
[103 ] SIMILAR [PASSED]
(1, 16, 320, 320) float32
cosine_similarity = 1.000000
euclidean_similarity = 1.000000
sqnr_similarity = 122.331476
[input.60 ] SIMILAR [PASSED]
...
(1, 255, 80, 80) float32
cosine_similarity = 1.000000
euclidean_similarity = 1.000000
sqnr_similarity = 119.547615
[1234 ] SIMILAR [PASSED]
(1, 255, 40, 40) float32
cosine_similarity = 1.000000
euclidean_similarity = 1.000000
sqnr_similarity = 118.485079
[1249 ] SIMILAR [PASSED]
(1, 255, 20, 20) float32
cosine_similarity = 1.000000
euclidean_similarity = 1.000000
sqnr_similarity = 118.825769
200 compared
200 passed
1 equal, 3 close, 196 similar
0 failed
0 not equal, 0 not similar
min_similiarity = (0.9999997615814209, 0.999998192529101, 114.64153289794922)
Target yolov5n_top_outputs.npz
Reference yolov5n_ref_outputs.npz
npz compare PASSED.
compare 1249: 100%|██████████████████████████████████████████████████████████████████| 200/200 [00:08<00:00, 24.98it/s]
[Success]: npz_tool.py compare yolov5n_top_outputs.npz yolov5n_ref_outputs.npz --tolerance 0.99,0.99 --except - -vv
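Note on the output shapes: the last three tensors compared above are (1, 255, 80, 80), (1, 255, 40, 40) and (1, 255, 20, 20). For a YOLOv5 detection head with three anchors per scale and strides 8, 16 and 32, the channel count is 3 * (classes + 5), which gives 255 for the 80 COCO classes. The sketch below computes the shapes one would expect for another class count. The function name is hypothetical, and the class count of the poster's best.pt is not given in this thread, so the single-class case is only an illustration.

```python
def detect_head_shapes(num_classes, img_size=640, strides=(8, 16, 32), anchors_per_scale=3):
    """Expected raw head output shapes (N, C, H, W) for a YOLOv5-style detector."""
    if any(img_size % s for s in strides):
        raise ValueError(f"img_size {img_size} must be a multiple of every stride {strides}")
    channels = anchors_per_scale * (num_classes + 5)  # box (4) + objectness (1) + classes
    return [(1, channels, img_size // s, img_size // s) for s in strides]


coco = detect_head_shapes(80)    # matches the shapes in the log above
single = detect_head_shapes(1)   # hypothetical one-class custom model
print(coco)
print(single)
```

Any post-processing code on the device that assumes 255 channels would need to agree with the channel count of the model actually deployed.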
root@2a46fc75400f:/workspace/yolov5n_torch# run_calibration.py yolov5n.mlir \
last input data (idx=100) not valid, droped
input_num = 100, ref = 100
real input_num = 100
activation_collect_and_calc_th for op: 1249: 100%|███████████████████████████████████| 200/200 [04:25<00:00, 1.33s/it]
[2048] threshold: 1249: 100%|███████████████████████████████████████████████████████| 200/200 [00:00<00:00, 235.10it/s]
GmemAllocator use OpSizeOrderAssign
reused mem is 3276800, all mem is 43767600
GmemAllocator use OpSizeOrderAssign
reused mem is 3276800, all mem is 43767600
prepare data from 100
tune op: 1249: 100%|█████████████████████████████████████████████████████████████████| 200/200 [07:13<00:00, 2.17s/it]
auto tune end, run time:433.61561346054077
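Note on calibration: run_calibration.py writes per-tensor thresholds to yolov5n_cali_table, and the model_deploy.py output later in this log reports mode=INT8 with asymmetric=False. As a rough illustration of how a symmetric INT8 threshold is typically used, here is a simplified sketch. It is a textbook formulation with hypothetical function names, not TPU-MLIR's actual implementation, and the threshold value is made up.

```python
def quantize_symmetric(values, threshold):
    """Map floats to INT8 so that +/-threshold lands on +/-127, clamping outliers."""
    scale = 127.0 / threshold
    return [max(-127, min(127, round(v * scale))) for v in values]


def dequantize(q_values, threshold):
    """Map INT8 values back to approximate floats."""
    return [q * threshold / 127.0 for q in q_values]


threshold = 1.27  # illustrative value only
q = quantize_symmetric([0.0, 0.5, 1.27, 2.0, -0.25], threshold)
approx = dequantize(q, threshold)
print(q)
print(approx)
```

Values beyond the threshold (2.0 here) saturate at 127, which is one reason the calibration images should resemble the data the model will see on the device.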
root@2a46fc75400f:/workspace/yolov5n_torch# model_deploy.py
--qu> --mlir yolov5n.mlir \
Add preprocess, set the following params:
2024/11/11 19:37:39 - INFO :
_____________________________________________________
| preprocess: |
| (x - mean) * scale |
'-------------------------------------------------------'
config Preprocess args :
resize_dims : [640, 640]
keep_aspect_ratio : True
keep_ratio_mode : letterbox
pad_value : 0
pad_type : center
--------------------------
mean : [0.0, 0.0, 0.0]
scale : [1.0, 1.0, 1.0]
--------------------------
pixel_format : rgb
channel_format : nchw
[Running]: tpuc-opt yolov5n.mlir --chip-assign="chip=cv181x" --import-calibration-table="file=./yolov5n_cali_table asymmetric=False" --chip-top-optimize --fuse-preprocess="mode=INT8 customization_format=RGB_PLANAR align=False" --convert-top-to-tpu="mode=INT8 asymmetric=False linear_quant_mode=NORMAL doWinograd=False ignore_f16_overflow=False" --canonicalize -o yolov5n_cv181x_int8_sym_tpu.mlir
Entering FusePreprocessPass.
Inserting ScalelutOp.
[Success]: tpuc-opt yolov5n.mlir --chip-assign="chip=cv181x" --import-calibration-table="file=./yolov5n_cali_table asymmetric=False" --chip-top-optimize --fuse-preprocess="mode=INT8 customization_format=RGB_PLANAR align=False" --convert-top-to-tpu="mode=INT8 asymmetric=False linear_quant_mode=NORMAL doWinograd=False ignore_f16_overflow=False" --canonicalize -o yolov5n_cv181x_int8_sym_tpu.mlir
[CMD]: model_runner.py --input yolov5n_in_ori.npz --model yolov5n_cv181x_int8_sym_tpu.mlir --output yolov5n_cv181x_int8_sym_tpu_outputs.npz
[##################################################] 100%
[Running]: npz_tool.py compare yolov5n_cv181x_int8_sym_tpu_outputs.npz yolov5n_top_outputs.npz --tolerance 0.96,0.72 --except - -vv
compare 1249: 99%|█████████████████████████████████████████████████████████████████▌| 141/142 [00:05<00:00, 21.14it/s]
[input.26 ] SIMILAR [PASSED]
(1, 16, 320, 320) float32
cosine_similarity = 0.999769
euclidean_similarity = 0.978254
sqnr_similarity = 32.948797
[103 ] SIMILAR [PASSED]
(1, 16, 320, 320) float32
cosine_similarity = 0.999255
euclidean_similarity = 0.961272
sqnr_similarity = 24.556572
...
(1, 255, 40, 40) float32
cosine_similarity = 0.999221
euclidean_similarity = 0.959803
sqnr_similarity = 18.724862
[1249 ] SIMILAR [PASSED]
(1, 255, 20, 20) float32
cosine_similarity = 0.999214
euclidean_similarity = 0.960290
sqnr_similarity = 18.388116
142 compared
142 passed
0 equal, 0 close, 142 similar
0 failed
0 not equal, 0 not similar
min_similiarity = (0.9679524302482605, 0.7443984113616068, 11.602303981781006)
Target yolov5n_cv181x_int8_sym_tpu_outputs.npz
Reference yolov5n_top_outputs.npz
npz compare PASSED.
compare 1249: 100%|██████████████████████████████████████████████████████████████████| 142/142 [00:06<00:00, 22.79it/s]
[Success]: npz_tool.py compare yolov5n_cv181x_int8_sym_tpu_outputs.npz yolov5n_top_outputs.npz --tolerance 0.96,0.72 --except - -vv
[Running]: tpuc-opt yolov5n_cv181x_int8_sym_tpu.mlir --mlir-disable-threading --strip-io-quant="quant_input=False quant_output=False" --chip-tpu-optimize --distribute='num_device=1' --weight-reorder --subnet-divide="dynamic=False" --op-reorder --layer-group="opt=2" --parallel='num_core=1' --address-assign -o yolov5n_cv181x_int8_sym_final.mlir
==---------------------------==
Run LayerGroupSearchPass :
Searching the optimal layer groups
==---------------------------==
=======================================================
***** Dynamic Programming layer group with cluster ****
total num of base_group is 7
clusters idx(size): 0(1), 1(2), 3(2), 5(2), 7(2), 9(2), 11(2), 13(1), 14(1), 15(2), 17(2), 19(2), 21(2), 23(2), 25(2), 27(2), 29(2), 31(2), 33(1), 34(2), 36(2), 38(2), 40(2), 42(2), 44(2), 46(2), 48(2), 50(2), 52(2), 54(2), 56(1), 57(1), 58(2), 60(1), 61(2), 63(2), 65(2), 67(2), 69(2), 71(2), 73(2), 75(2), 77(2), 79(2), 81(2), 83(2), 85(2), 87(2), 89(2), 91(2), 93(1), 94(1), 95(2), 97(2), 99(2), 101(2), 103(2), 105(2), 107(2), 109(1), 110(2), 112(2), 114(2), 116(2), 118(2), 120(2), 122(1), 123(1), 124(2), 126(2), 128(2), 130(2), 132(2), 134(2), 136(1), 137(2), 139(2),
process base group 0, layer_num=141, cluster_num=77
Searching best group slices...
[#################################################] 100%
clusters idx(size): 0(1),
process base group 1, layer_num=1, cluster_num=1
clusters idx(size): 0(1),
process base group 2, layer_num=1, cluster_num=1
clusters idx(size): 0(1),
process base group 3, layer_num=1, cluster_num=1
clusters idx(size): 0(1),
process base group 4, layer_num=1, cluster_num=1
clusters idx(size): 0(1),
process base group 5, layer_num=1, cluster_num=1
clusters idx(size): 0(1),
process base group 6, layer_num=1, cluster_num=1
Consider redundant computation and gdma cost
The final cost of the two group is 1182594
//// Group cost 1182594, optimal cut idx 139
The final cost of the two group is 1116710
//// Group cost 1116710, optimal cut idx 138
The final cost of the two group is 1315164
The final cost of the two group is 970894
//// Group cost 970894, optimal cut idx 137
The final cost of the two group is 866493
//// Group cost 866493, optimal cut idx 136
The final cost of the two group is 877481
The final cost of the two group is 941308
The final cost of the two group is 892746
The pre cost of the two group is 898167
The final cost of the two group is 901710
//// Group cost 901710, optimal cut idx 132
The final cost of the two group is 832079
....
The final cost of the two group is 4092392
//// Group cost 4092392, optimal cut idx 0
Merge cut idx to reduce gdma cost
==---------------------------==
Run GroupPostTransformPass :
Some transform after layer groups is determined
==---------------------------==
==---------------------------==
Run TimeStepAssignmentPass :
Assign timestep task for each group.
==---------------------------==
==---------------------------==
Run LocalMemoryAllocationPass :
Allocate local memory for all layer groups
==---------------------------==
==---------------------------==
Run TimeStepCombinePass :
Combine time step for better parallel balance
==---------------------------==
==---------------------------==
Run GroupDataMoveOverlapPass :
Overlap data move between two layer group
==---------------------------==
GmemAllocator use OpSizeOrderAssign
[Success]: tpuc-opt yolov5n_cv181x_int8_sym_tpu.mlir --mlir-disable-threading --strip-io-quant="quant_input=False quant_output=False" --chip-tpu-optimize --distribute='num_device=1' --weight-reorder --subnet-divide="dynamic=False" --op-reorder --layer-group="opt=2" --parallel='num_core=1' --address-assign -o yolov5n_cv181x_int8_sym_final.mlir
[Running]: tpuc-opt yolov5n_cv181x_int8_sym_final.mlir --codegen="model_file=yolov5n_int8_fuse.cvimodel embed_debug_info=true model_version=latest" -o /dev/null
[oc_pos=32] cur_oc 8, stepSize 1024, compressedSize 1040, SKIP
[Success]: tpuc-opt yolov5n_cv181x_int8_sym_final.mlir --codegen="model_file=yolov5n_int8_fuse.cvimodel embed_debug_info=true model_version=latest" -o /dev/null
[CMD]: model_runner.py --input yolov5n_in_ori.npz --model yolov5n_int8_fuse.cvimodel --output yolov5n_cv181x_int8_sym_model_outputs.npz
setenv:cv181x
Start TPU Simulator for cv181x
device[0] opened, 4294967296
version: 1.4.0
yolov5n Build at 2024-11-11 19:37:51 For platform cv181x
Cmodel: bm_load_cmdbuf
Max SharedMem size:2457600
Cmodel: bm_run_cmdbuf
device[0] closed
[Running]: npz_tool.py compare yolov5n_cv181x_int8_sym_model_outputs.npz yolov5n_cv181x_int8_sym_tpu_outputs.npz --tolerance 0.99,0.90 --except - -vv
compare 1249_f32: 88%|█████████████████████████████████████████████████████████▊ | 7/8 [00:00<00:00, 69.16it/s]
[964 ] EQUAL [PASSED]
(1, 64, 80, 80) float32
[1081 ] EQUAL [PASSED]
(1, 128, 40, 40) float32
[input.1 ] EQUAL [PASSED]
(1, 256, 20, 20) float32
[1198 ] EQUAL [PASSED]
(1, 256, 20, 20) float32
[1219_f32 ] EQUAL [PASSED]
(1, 255, 80, 80) float32
[1234_f32 ] EQUAL [PASSED]
(1, 255, 40, 40) float32
[1249 ] EQUAL [PASSED]
(1, 255, 20, 20) float32
[1249_f32 ] EQUAL [PASSED]
(1, 255, 20, 20) float32
8 compared
8 passed
8 equal, 0 close, 0 similar
0 failed
0 not equal, 0 not similar
min_similiarity = (1.0, 1.0, inf)
Target yolov5n_cv181x_int8_sym_model_outputs.npz
Reference yolov5n_cv181x_int8_sym_tpu_outputs.npz
npz compare PASSED.
compare 1249_f32: 100%|██████████████████████████████████████████████████████████████████| 8/8 [00:00<00:00, 27.38it/s]
[Success]: npz_tool.py compare yolov5n_cv181x_int8_sym_model_outputs.npz yolov5n_cv181x_int8_sym_tpu_outputs.npz --tolerance 0.99,0.90 --except - -vv
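Note on the compare steps: each npz_tool.py run above reports cosine_similarity, euclidean_similarity and sqnr_similarity per tensor and checks them against --tolerance. The sketch below shows only the cosine part, using the standard definition and made-up sample values. The exact formulas npz_tool.py uses for the other two metrics are not shown in this thread, so they are left out.

```python
import math


def cosine_similarity(a, b):
    """Standard cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Treat two all-zero vectors as identical, anything else as unrelated.
        return 1.0 if norm_a == norm_b else 0.0
    return dot / (norm_a * norm_b)


reference = [0.10, -0.52, 0.33, 0.91]   # made-up float outputs
quantized = [0.11, -0.50, 0.31, 0.90]   # made-up outputs after quantization
score = cosine_similarity(reference, quantized)
passed = score >= 0.99                   # same style of check as --tolerance
print(score, passed)
```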
root@2a46fc75400f:/workspace/yolov5n_torch# scp -r /workspace/tpu-sdk [email protected]:/mnt/tpu/
[email protected]'s password:
OpenCVModules-release.cmake 100% 2053 402.1KB/s 00:00
haarcascade_eye.xml 100% 333KB 2.7MB/s 00:00
haarcascade_smile.xml 100% 184KB 2.7MB/s 00:00
....
libcvimath-static.a 100% 172KB 2.6MB/s 00:00
libcviruntime.so 100% 574KB 2.9MB/s 00:00
root@2a46fc75400f:/workspace/yolov5n_torch# scp /workspace/yolov5n_torch/yolov5n_int8_fuse.cvimodel [email protected]:/mnt/tpu/tpu-sdk/
[email protected]'s password:
yolov5n_int8_fuse.cvimodel 100% 2158KB 2.9MB/s 00:00
root@2a46fc75400f:/workspace/yolov5n_torch# ls -l
total 389176
drwxr-xr-x 2 root root 4096 Nov 11 19:21 COCO2017
-rw-r--r-- 1 root root 12398 Nov 11 19:37 _weight_map.csv
-rwxr-xr-x 1 root root 14447400 Nov 9 09:00 best.pt
-rwxr-xr-x 1 root root 40717 Oct 29 07:42 cat.jpg
drwxr-xr-x 2 root root 4096 Nov 11 19:21 image
drwxr-xr-x 5 root root 4096 Nov 7 14:12 train_data
-rwxr-xr-x 1 root root 2524205 Nov 8 01:36 train_data.zip
drwxr-xr-x 2 root root 4096 Nov 9 10:02 work
-rw-r--r-- 1 root root 64711 Nov 11 19:23 yolov5n.mlir
-rw-r--r-- 1 root root 8011 Nov 11 19:35 yolov5n_cali_table
-rw-r--r-- 1 root root 2210112 Nov 11 19:37 yolov5n_int8_fuse.cvimodel
root@2a46fc75400f:/workspace/yolov5n_torch#
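Note on file sizes: the listing above shows best.pt at 14447400 bytes, while the wget output earlier reports yolov5n.pt at 4062133 bytes. A quick comparison is sketched below with a hypothetical helper. A custom checkpoint several times larger than the reference might mean it was trained from a larger model variant than yolov5n, but that is only a guess and the thread does not confirm it.

```python
def size_ratio(custom_bytes, reference_bytes):
    """How many times larger the custom checkpoint is than the reference."""
    if reference_bytes <= 0:
        raise ValueError("reference size must be positive")
    return custom_bytes / reference_bytes


BEST_PT_BYTES = 14_447_400    # from the ls -l output above
YOLOV5N_PT_BYTES = 4_062_133  # from the wget output above

ratio = size_ratio(BEST_PT_BYTES, YOLOV5N_PT_BYTES)
print(f"best.pt is about {ratio:.2f}x the size of yolov5n.pt")
```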
Now using best.pt:
model_deploy.py \
 --mlir yolov5n.mlir \
 --quantize INT8 \
 --calibration_table ./yolov5n_cali_table \
 --chip cv181x \
 --test_input ./cat.jpg \
 --test_reference yolov5n_top_outputs.npz \
 --compare_all \
 --fuse_preprocess \
 --debug \
 --model yolov5n_int8_fuse.cvimodel
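Note on the command above: it reuses the yolov5n file names even for the best.pt run. A small sketch of assembling the same flags from a single model name follows, so that artifacts from two models are less likely to be mixed up. build_deploy_command and its name parameter are hypothetical, and only the flags already listed in this issue are used. Whether mixed-up file names played any role in the segmentation fault is not stated in the thread.

```python
import shlex


def build_deploy_command(name, chip="cv181x", test_image="./cat.jpg"):
    """Build the model_deploy.py argument list with every file derived from name."""
    return [
        "model_deploy.py",
        "--mlir", f"{name}.mlir",
        "--quantize", "INT8",
        "--calibration_table", f"./{name}_cali_table",
        "--chip", chip,
        "--test_input", test_image,
        "--test_reference", f"{name}_top_outputs.npz",
        "--compare_all",
        "--fuse_preprocess",
        "--debug",
        "--model", f"{name}_int8_fuse.cvimodel",
    ]


cmd = build_deploy_command("yolov5n")
printable = shlex.join(cmd)  # single shell-safe line for logging
print(printable)
```

The list form can be passed to subprocess.run(cmd, check=True) inside the TPU-MLIR container.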
Thank you!
Are you willing to submit a PR?