• yolov8 与 yolov8 + tensorRT 比较 (使用pt 转 onnx 转 engine)
  • from ultralytics import YOLO
    import os

    os.chdir(os.path.dirname(os.path.abspath(__file__)))

    model = YOLO("./Weights/yolov8n.pt", task="detect")
    result = model.predict("./Image/bus.jpg", save=True)

    model = YOLO("./Weights/yolov8n.engine", task="detect")
    result = model.predict("./Image/bus.jpg", save=True)

    yolov8
    image 1/1 d:\\05.Python\\yolov8_tensorRT\\Image\\bus.jpg: 640x480 4 persons, 1 bus, 1 stop sign, 44.9ms
    Speed: 2.0ms preprocess, 44.9ms inference, 53.4ms postprocess per image at shape (1, 3, 640, 480)

    yolov8 + tensorRT
    Loading Weights\\yolov8n.engine for TensorRT inference...
    [11/09/2024-18:59:07] [TRT] [I] Loaded engine size: 9 MiB
    [11/09/2024-18:59:07] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +12, now: CPU 0, GPU 18
    (MiB)
    WARNING ⚠️ Metadata not found for \'model=./Weights/yolov8n.engine\'

    image 1/1 d:\\05.Python\\yolov8_tensorRT\\Image\\bus.jpg: 640x640 4 class0s, 1 class5, 2.0ms
    Speed: 2.0ms preprocess, 2.0ms inference, 1.0ms postprocess per image at shape (1, 3, 640, 640)

    • yolov8
    • yolov8 + tensorRT
  • yolov8 + tensorRT 比较 fp32 fp16 int8 (使用pt 转 engine)
  • from ultralytics import YOLO
    import os

    os.chdir(os.path.dirname(os.path.abspath(__file__)))

    print("pt to engine fp32:\\n")
    model = YOLO("./Weights/yolov8n_fp32.engine", task="detect")
    result = model.predict("./Image/bus.jpg", save=True)

    print("pt to engine fp16:\\n")
    model = YOLO("./Weights/yolov8n_fp16.engine", task="detect")
    result = model.predict("./Image/bus.jpg", save=True)

    print("pt to engine int8:\\n")
    model = YOLO("./Weights/yolov8n_int8.engine", task="detect")
    result = model.predict("./Image/bus.jpg", save=True)

    pt to engine fp32:
    Loading Weights\\yolov8n_fp32.engine for TensorRT inference...
    [11/09/2024-21:22:58] [TRT] [I] Loaded engine size: 20 MiB
    [11/09/2024-21:22:58] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +18, now: CPU 0, GPU 34 (MiB)

    image 1/1 d:\\05.Python\\yolov8_tensorRT\\Image\\bus.jpg: 640x640 4 persons, 1 bus, 3.0ms
    Speed: 2.7ms preprocess, 3.0ms inference, 1.0ms postprocess per image at shape (1, 3, 640, 640)

    pt to engine fp16:
    Loading Weights\\yolov8n_fp16.engine for TensorRT inference...
    [11/09/2024-21:22:58] [TRT] [I] The logger passed into createInferRuntime differs
    from one already provided for an existing builder, runtime, or refitter. Uses of the global logger, returned by nvinfer1::getLogger(), will return the existing value.
    [11/09/2024-21:22:58] [TRT] [I] Loaded engine size: 9 MiB
    [11/09/2024-21:22:58] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +1, GPU +13, now: CPU 1, GPU 54 (MiB)

    image 1/1 d:\\05.Python\\yolov8_tensorRT\\Image\\bus.jpg: 640x640 4 persons, 1 bus, 2.0ms
    Speed: 1.5ms preprocess, 2.0ms inference, 2.0ms postprocess per image at shape (1, 3, 640, 640)

    pt to engine int8:
    Loading Weights\\yolov8n_int8.engine for TensorRT inference...
    [11/09/2024-21:22:58] [TRT] [I] The logger passed into createInferRuntime differs from one a
    one already provided for an existing builder, runtime, or refitter. Uses of the global ogrre
    logger, returned by nvinfer1::getLogger(), will return the existing value.
    [11/09/2024-21:22:58] [TRT] [I] Loaded engine size: 7 MiB
    [11/09/2024-21:22:58] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionCoionContext creation: CPU +0, GPU +2500, now: CPU 1, GPU 2557 (MiB)

    image 1/1 d:\\05.Python\\yolov8_tensorRT\\Image\\bus.jpg: 640x640 3 persons, 1 bus, 2.0ms
    Speed: 2.0ms preprocess, 2.0ms inference, 2.0ms postprocess per image at shape (1, 3, 640, 640, 640)

    • FP32:精度最佳,但推理时间较长,模型较大。
    • FP16:在推理时间和模型大小之间提供了良好的平衡。
    • INT8:模型最小,推理时间最快,但可能会有些精度损失,特别是在侦测结果上。

    参考文件:nvidia: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#python_topicsultralytics: https://docs.ultralytics.com/integrations/tensorrt/#usage