Acta Theriologica Sinica (兽类学报)

• Article •

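Deep-learning-based AI recognition of animals in infrared camera imagery: a case study of the Northeast Tiger and Leopard National Park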

宫一男1 谭孟雨1 王震2 赵国静1 蒋沛林1 蒋仕铭1 张鼎基1 葛剑平1 冯利民1   

  1. (1东北虎豹生物多样性国家野外科学观测研究站,东北虎豹国家公园保护生态学国家林业和草原局重点实验室,国家林业和草原局东北虎豹监测与研究中心,东北虎豹国家公园研究院,生物多样性与生态工程教育部重点实验室,北京师范大学生态研究所,北京100875)
    (2 天津通信广播集团有限公司,天津 300140)
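  • Publication date: 2019-07-30 Release date: 2019-07-25
  • Corresponding author: FENG Limin (冯利民), E-mail: fenglimin@bnu.edu.cn
  • Funding:
    National Natural Science Foundation of China (31270537, 31270567, 31200410, 31210103911, 31470566); Special Program for Basic Work of the Ministry of Science and Technology of China (2012FY112000); Cyrus Tang Foundation (2016)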

AI recognition of infrared camera image of wild animals based on deep learning: Northeast Tiger and Leopard National Park for example

GONG Yinan1, TAN Mengyu1, WANG Zhen2, ZHAO Guojing1, JIANG Peilin1, JIANG Shiming1, ZHANG Dingji1, GE Jianping1, FENG Limin1   

  1. (1 Northeast Tiger and Leopard Biodiversity National Observation and Research Station, Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering,
    National Forestry and Grassland Administration Key Laboratory for Conservation Ecology of Northeast Tiger and Leopard National Park, National Forestry and Grassland Administration Amur Tiger and Amur Leopard Monitoring and Research Center, Institute of Ecology, Beijing Normal University, Beijing 100875, China)
    (2 Tianjin 712 Communication & Broadcasting Co., Ltd, Tianjin 300140, China)
  • Online: 2019-07-30 Published: 2019-07-25

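Abstract (translated from the Chinese):

Infrared camera monitoring produces a very large volume of imagery that urgently needs fast, automatic identification. Taking the Northeast Tiger and Leopard National Park as an example, this study applies convolutional neural networks and deep learning to identify species automatically in infrared camera imagery. Infrared camera videos of 8 species were selected and sampled uniformly into still images at a frame rate of 50. For each species, images covering different angles and environmental conditions were screened to build an image data set of 2,074 training images and 519 test images. Targets in the images were marked with bounding boxes and annotated, and a YOLO v3 model under the darknet framework was trained. Training was first carried out without separating day (RGB) and night (grayscale) images, then with day and night images separated, and finally with fine-tuning applied to the day and night images separately. Preliminary results show that automatic species identification with YOLO v3 on infrared camera images taken under natural conditions can reduce the manual workload to some extent, but its performance still needs to be improved by refining the data set. Fine-tuning may serve as an aid when the data set is small. The mean average precision of the models over the 8 species ranged from 84.9% to 96.0%, and the models converged.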
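The bounding-box annotation step can be illustrated with a minimal sketch. It assumes the boxes were first drawn in pixel coordinates and then written in the plain-text label format that darknet reads for YOLO training (one line per object: class index, then box centre, width and height, all normalised by the image size). The function name `to_darknet_label` and the example numbers are hypothetical; the abstract does not say which annotation tool or class indices the authors used.

```python
from typing import Tuple


def to_darknet_label(
    class_id: int,
    box: Tuple[float, float, float, float],
    image_size: Tuple[int, int],
) -> str:
    """Convert a pixel box (x_min, y_min, x_max, y_max) into a darknet label line.

    image_size is (width, height) in pixels. The helper name and argument
    layout are illustrative assumptions, not taken from the paper.
    """
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    if not (0 <= x_min < x_max <= img_w and 0 <= y_min < y_max <= img_h):
        raise ValueError("box must lie inside the image and have positive area")

    # darknet expects the box centre and size as fractions of the image size
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: class 3, a 200 x 200 pixel box in a 400 x 500 image
print(to_darknet_label(3, (100, 50, 300, 250), (400, 500)))
```

Each image then gets a text file with one such line per annotated animal, which keeps the labels independent of the image resolution.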
 

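Keywords (translated from the Chinese): Deep learning, Convolutional neural network, Fine-tuning, Automatic identification, Wild animals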

Abstract:

Infrared cameras produce large quantities of video data on wild animals, and selecting and identifying the footage takes a great deal of work. To meet the demand for fast, automatic identification, this study uses the Northeast Tiger and Leopard National Park as an example to explore the practicability of using deep learning with convolutional neural networks to automatically identify animal species in videos taken by infrared cameras in the wild under natural conditions. The data set consists of pictures of 8 species, captured from videos taken in different seasons and under different conditions, with 2,074 pictures in the training set and 519 in the test set. Regions of interest were selected and labeled, and the model was YOLO v3 under the darknet framework. In the first group of experiments, all pictures were placed in one data set. In the second group, the pictures were divided into day (RGB) and night (gray) sets. In the third group, they were divided in the same way and fine-tuning was used. The mean average precision of the models ranged from 84.9% to 96.0%, and the models converged. The results show that, although a better training set is still needed to improve the models, using YOLO v3 to identify wild animals automatically is a practicable way to save manpower, and fine-tuning could help when the training set is small.
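To make the reported metric concrete, the following sketch shows one common way to compute average precision (AP) for a single class and then average it over classes to get mean average precision. It assumes each detection has already been marked as a true or false positive by matching it to ground-truth boxes, and it uses the all-point interpolated area under the precision-recall curve. The abstract does not state which AP variant the authors used, so the functions, class names and numbers below are illustrative assumptions only.

```python
from typing import Dict, List, Tuple

Detection = Tuple[float, bool]  # (confidence, is_true_positive)


def average_precision(detections: List[Detection], num_ground_truth: int) -> float:
    """All-point interpolated AP for one class (an assumed variant)."""
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions: List[float] = []
    recalls: List[float] = []
    true_pos = false_pos = 0
    for _, is_true_positive in ranked:
        if is_true_positive:
            true_pos += 1
        else:
            false_pos += 1
        precisions.append(true_pos / (true_pos + false_pos))
        recalls.append(true_pos / num_ground_truth)

    # Interpolate: precision at a recall level is the best precision
    # achieved at that recall or any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangles under the interpolated precision-recall curve.
    ap = 0.0
    previous_recall = 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(per_class: Dict[str, Tuple[List[Detection], int]]) -> float:
    """Average the per-class AP values; returns 0.0 for an empty input."""
    aps = [average_precision(dets, n_gt) for dets, n_gt in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Hypothetical detections for two placeholder classes
example = {
    "species_a": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "species_b": ([(0.9, True), (0.8, True)], 2),
}
print(round(mean_average_precision(example), 4))
```

In this toy input, the single false positive ranked between two true positives lowers the AP of `species_a`, while `species_b` is detected perfectly, so the mean falls between the two per-class values.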

Key words: Deep learning, Convolutional neural network (CNN), Fine-tuning, Automatic identification, Wild animals