Object Detection: Video Detection with TensorFlow

Tags: none | Category: Machine Learning | Created: 2024-08-31 04:07:27 | Updated: 2025-04-28 14:37:41

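Once the object detection environment is set up, you can move on to detecting objects in video. "TensorFlow Object Detection API: an out-of-the-box object detection API" provides part of the video detection code, which reads a video and outputs the objects recognized in it.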

# TensorFlow Object Detection API: video test
# Load packages
import os
import pathlib
import time

import cv2
import numpy as np
import tensorflow as tf
from object_detection.builders import model_builder
from object_detection.utils import label_map_util, config_util
from object_detection.utils import visualization_utils as viz_utils

# GPU memory configuration
# Enable dynamic memory allocation (memory growth) on every GPU
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)


# Load the model
# Download the model archive and extract it
def download_model(model_name, model_date):
    base_url = 'http://download.tensorflow.org/models/object_detection/tf2/'
    model_file = model_name + '.tar.gz'
    # untar=True extracts the archive after downloading it
    model_dir = tf.keras.utils.get_file(fname=model_name,
                                        origin=base_url + model_date + '/' + model_file,
                                        untar=True)
    return str(model_dir)


MODEL_DATE = '20200711'
MODEL_NAME = 'centernet_hg104_1024x1024_coco17_tpu-32'
PATH_TO_MODEL_DIR = download_model(MODEL_NAME, MODEL_DATE)
print(PATH_TO_MODEL_DIR)

# Load the model from the downloaded directory
# Paths to the pipeline config file and the checkpoint directory
PATH_TO_CFG = PATH_TO_MODEL_DIR + "/pipeline.config"
PATH_TO_CKPT = PATH_TO_MODEL_DIR + "/checkpoint"

# Start timing
print('Loading model... ', end='')
start_time = time.time()
# Load the pipeline config, then build the model from it
configs = config_util.get_configs_from_pipeline_file(PATH_TO_CFG)
model_config = configs['model']
detection_model = model_builder.build(model_config=model_config, is_training=False)
# Restore the model weights from the checkpoint
ckpt = tf.compat.v2.train.Checkpoint(model=detection_model)
ckpt.restore(os.path.join(PATH_TO_CKPT, 'ckpt-0')).expect_partial()

# Stop timing
end_time = time.time()
elapsed_time = end_time - start_time
print(f'Done! Took {elapsed_time} seconds.')


# Build the label lookup table
# Download the labels file
def download_labels(filename):
    base_url = 'https://raw.githubusercontent.com/tensorflow/models'
    base_url += '/master/research/object_detection/data/'
    label_dir = tf.keras.utils.get_file(fname=filename,
                                        origin=base_url + filename,
                                        untar=False)
    label_dir = pathlib.Path(label_dir)
    return str(label_dir)


LABEL_FILENAME = 'mscoco_label_map.pbtxt'
PATH_TO_LABELS = download_labels(LABEL_FILENAME)
print(PATH_TO_LABELS)

# Build the label lookup table (class id to display name)
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)


# Object detection on video frames
@tf.function
def detect_fn(image):
    image, shapes = detection_model.preprocess(image)
    prediction_dict = detection_model.predict(image, shapes)
    detections = detection_model.postprocess(prediction_dict, shapes)

    return detections


# Use a webcam
# cap = cv2.VideoCapture(0)
# Read from a video file
cap = cv2.VideoCapture('./data/pedestrians.mp4')
i = 0
while True:
    # Capture the video frame by frame (from the camera or the mp4 file)
    ret, image_np = cap.read()
    # Stop when no frame is returned (end of the video or a read failure)
    if not ret:
        break

    # Optional test: flip horizontally
    # image_np = np.fliplr(image_np).copy()

    # Optional test: grayscale
    # image_np = np.tile(
    #     np.mean(image_np, 2, keepdims=True), (1, 1, 3)).astype(np.uint8)

    # Add a batch dimension, giving (batch, height, width, channels),
    # and convert to a TensorFlow tensor
    input_tensor = tf.convert_to_tensor(np.expand_dims(image_np, 0), dtype=tf.float32)

    # detections: object information (boxes, classes, scores)
    detections = detect_fn(input_tensor)
    num_detections = int(detections.pop('num_detections'))

    # Print the number of detections for the first frame only
    if i == 0:
        print(f'Number of objects detected: {num_detections}')

    # Keep the first num_detections entries of each output as NumPy arrays
    detections = {key: value[0, :num_detections].numpy() for key, value in detections.items()}
    detections['detection_classes'] = detections['detection_classes'].astype(int)

    # Draw boxes around the detected objects
    label_id_offset = 1
    image_np_with_detections = image_np.copy()
    viz_utils.visualize_boxes_and_labels_on_image_array(
          image_np_with_detections,
          detections['detection_boxes'],
          detections['detection_classes'] + label_id_offset,
          detections['detection_scores'],
          category_index,                   # a dict containing category dictionaries keyed by category indices
          use_normalized_coordinates=True,  # whether boxes is to be interpreted as normalized coordinates or not.
          max_boxes_to_draw=200,            # maximum number of boxes to visualize.  If None, draw all boxes.
          min_score_thresh=.60,             # minimum score threshold for a box to be visualized
          agnostic_mode=False)              # boolean (default: False) controlling whether to evaluate in class-agnostic mode or not.  This mode will display scores but ignore classes.

    print("i = ", i)
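    # Show the detection result
    img = cv2.resize(image_np_with_detections, (800, 600))
    cv2.imshow('object detection', img)

    # Save the 30th frame to disk
    i += 1
    if i == 30:
        cv2.imwrite('./data/pedestrians.png', img)

    # Press q to quit. cv2.waitKey() waits up to the given number of
    # milliseconds for a key press before moving on to the next frame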
    if cv2.waitKey(25) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
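
The script above draws `detection_boxes` with `use_normalized_coordinates=True` and hides anything scoring below `min_score_thresh`. If you want the same boxes as pixel coordinates for your own processing, a minimal sketch might look like the following. It assumes the `[ymin, xmin, ymax, xmax]` box order used by the Object Detection API; the helper name `boxes_to_pixels` and the sample values are hypothetical, not part of the API or real model output.

```python
import numpy as np


def boxes_to_pixels(boxes, scores, height, width, min_score=0.6):
    """Keep boxes whose score is >= min_score and scale them to pixels.

    boxes:  (N, 4) array of normalized [ymin, xmin, ymax, xmax] values.
    scores: (N,) array of confidence scores.
    Returns an (M, 4) integer array of [ymin, xmin, ymax, xmax] in pixels.
    """
    boxes = np.asarray(boxes, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)
    keep = scores >= min_score
    # Rows are scaled by the image height, columns by the image width
    scale = np.array([height, width, height, width], dtype=np.float32)
    return np.rint(boxes[keep] * scale).astype(int)


# Made-up example values, not real model output
example_boxes = np.array([[0.25, 0.5, 0.75, 1.0],
                          [0.0, 0.0, 0.5, 0.5]])
example_scores = np.array([0.9, 0.3])
pixel_boxes = boxes_to_pixels(example_boxes, example_scores, height=600, width=800)
print(pixel_boxes)
```

Inside the detection loop this could be called as `boxes_to_pixels(detections['detection_boxes'], detections['detection_scores'], *image_np.shape[:2])`, since the first two entries of `image_np.shape` are the frame height and width.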

3. Stuttering

References:
[1] Video stream stuttering: a code issue
[2] OpenCV notes: frame-skipping output with cv2.VideoCapture (this one applies frame skipping)
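
Reference [2] deals with stuttering by skipping frames. One way that idea could be applied to the detection loop is to run the detector only on every Nth frame and pass over the rest with `grab()`, fetching a frame with `retrieve()` only when it will be used. The sketch below is an assumption about how this might be wired up, not the code from the referenced article. `iter_every_nth` and `FakeCapture` are hypothetical names, and `FakeCapture` only stands in for `cv2.VideoCapture` so the example runs without a video file.

```python
def iter_every_nth(cap, stride):
    """Yield (frame_index, frame) for every `stride`-th frame of `cap`.

    `cap` is any object with cv2.VideoCapture-style grab() and retrieve()
    methods. Frames in between are grabbed but never retrieved.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    index = 0
    while cap.grab():                    # advance to the next frame
        if index % stride == 0:
            ok, frame = cap.retrieve()   # fetch only the frames we keep
            if not ok:
                break
            yield index, frame
        index += 1


class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture, used only for this demo."""

    def __init__(self, frames):
        self._frames = list(frames)
        self._pos = -1

    def grab(self):
        self._pos += 1
        return self._pos < len(self._frames)

    def retrieve(self):
        return True, self._frames[self._pos]


kept = [index for index, _ in iter_every_nth(FakeCapture(range(10)), stride=3)]
print(kept)
```

With a real video, `for index, image_np in iter_every_nth(cv2.VideoCapture('./data/pedestrians.mp4'), stride=3):` could take the place of the `while True` loop and its `cap.read()` call. The stride of 3 is an arbitrary illustration value; a larger stride lowers the detection load at the cost of a choppier display.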
