

MonkeyTrail: A scalable video-based method for tracking macaque movement trajectory in daily living cages

Meng-Shi Liu, Jin-Quan Gao, Gu-Yue Hu, Guang-Fu Hao, Tian-Zi Jiang, Chen Zhang, Shan Yu

Meng-Shi Liu, Jin-Quan Gao, Gu-Yue Hu, Guang-Fu Hao, Tian-Zi Jiang, Chen Zhang, Shan Yu. MonkeyTrail: A scalable video-based method for tracking macaque movement trajectory in daily living cages. Zoological Research, 2022, 43(3): 343-351. doi: 10.24272/j.issn.2095-8137.2021.353



Funds: This work was supported by the National Key Research and Development Program of China (2017YFA0105203, 2017YFA0105201), National Science Foundation of China (31771076, 81925011), Strategic Priority Research Program of the Chinese Academy of Sciences (CAS) (XDB32040201), Beijing Academy of Artificial Intelligence, and Key-Area Research and Development Program of Guangdong Province (2019B030335001)
  • Abstract: Behavioral analysis of macaques can provide important experimental evidence for neuroscience research. In recent years, automated video analysis of animal behavior has received widespread attention. However, most existing methods require specific experimental environments to reduce interference from object occlusion or environmental change, and an effective tool for large-scale tracking of macaque movement trajectories under daily housing conditions is still lacking. In this study, we propose a new method (MonkeyTrail) to achieve this goal. Its key principle is to frequently generate a virtual empty background and, combined with background subtraction, accurately obtain the foreground image containing the moving animal. Empty-background generation relies on the frame-difference method (FDM) and a deep-learning-based visual object detection model (YOLOv5). The entire setup consists of low-cost hardware and works effectively in the daily environment of singly housed macaques. To test its performance, we annotated >8 000 video frames as a validation dataset, containing macaque bounding-box data under various conditions. Test results show that, under the same conditions, MonkeyTrail surpassed the traditional frame-difference method, background subtraction method, and two deep-learning-based methods (YOLOv5 and SSD) in both tracking accuracy and stability. By analyzing long-term surveillance video, MonkeyTrail successfully detected changes in the macaques' amount of movement and spatial preference. These results demonstrate that the method can support low-cost, relatively large-scale analysis of macaque daily behavior.
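The core idea described in the abstract, maintaining a virtual empty background and subtracting it from the current frame, can be sketched in a few lines of NumPy. This is a simplified illustration, not the published implementation: the bounding-box format, threshold value, and grayscale input are assumptions, and the actual pipeline uses FDM together with YOLOv5 detections to decide which regions are animal-free before updating the background.

```python
import numpy as np

def update_empty_background(background, frame, animal_box):
    # Copy the current frame into the stored empty background everywhere
    # OUTSIDE the detected animal's bounding box; inside the box, the old
    # (animal-free) background pixels are kept. animal_box is assumed to
    # be (x1, y1, x2, y2), e.g. from a YOLOv5 detection.
    x1, y1, x2, y2 = animal_box
    new_bg = frame.copy()
    new_bg[y1:y2, x1:x2] = background[y1:y2, x1:x2]
    return new_bg

def foreground_mask(frame, empty_bg, thresh=30):
    # Background subtraction: absolute difference against the virtual
    # empty background, then a fixed threshold (value is illustrative).
    diff = np.abs(frame.astype(np.int16) - empty_bg.astype(np.int16))
    if diff.ndim == 3:          # color frame: take max over channels
        diff = diff.max(axis=-1)
    return diff > thresh
```

Because the background is refreshed frequently in this way, slow environmental changes (lighting, moved objects) are absorbed into it instead of polluting the foreground, which is the failure mode of a fixed background shown in Figure 3.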
  • Figure  1.  Overall recording environment and camera setup

    A: One frame of recorded video, showing arrangement of monkey cages. For each recording, two cages in upper and middle positions with better visibility (marked by yellow box) were analyzed by proposed method. Position of camera in A is marked by red box. B: Diagram showing setup of recording cameras mounted on the other side of the room above cage height. Yellow and red boxes in B correspond to A.

    Figure  2.  MonkeyTrail workflow

    Figure  3.  Influence of environmental changes on efficacy of background subtraction

    A: Relationships among B–H. Pairs (C, F), (D, G), and (E, H) show empty backgrounds captured at certain times and their corresponding background subtraction results. Empty backgrounds C, D, and E were obtained at 1 h intervals. B is a real-time video frame captured near the time of E, from which each of C–E is subtracted.

    Figure  4.  Method to frequently update empty background

    Figure  5.  Background subtraction process with generated empty background

    A: One video frame showing typical situation in daily living cage. B: Background subtraction between A and virtual empty background generated temporally close to A, thus highlighting foreground containing animal. C: Image processing result of B, after spatial median filtering, binarizing, eroding, and dilating. A and B are redrawn from Figure 3B and H, respectively.
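The cleanup steps named in the Figure 5 caption (binarizing, eroding, dilating) can be illustrated with a pure-NumPy sketch; a real implementation would typically use OpenCV's cv2.medianBlur, cv2.erode, and cv2.dilate. The kernel size and threshold below are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def binarize(gray, thresh=30):
    # Threshold a grayscale difference image into a 0/1 mask.
    return (gray > thresh).astype(np.uint8)

def erode(mask, k=3):
    # Binary erosion with a k x k square structuring element:
    # a pixel stays 1 only if its whole neighborhood is 1.
    pad = k // 2
    p = np.pad(mask, pad, constant_values=0)
    out = np.ones_like(mask)
    for dy in range(k):
        for dx in range(k):
            out &= p[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out

def dilate(mask, k=3):
    # Binary dilation: a pixel becomes 1 if any neighbor is 1.
    pad = k // 2
    p = np.pad(mask, pad, constant_values=0)
    out = np.zeros_like(mask)
    for dy in range(k):
        for dx in range(k):
            out |= p[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out
```

Erosion followed by dilation (morphological opening) removes isolated noise pixels left by subtraction while approximately preserving the large connected region corresponding to the animal.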

    Figure  6.  Representative tracking results for macaque during daytime and nighttime

    Green box and blue line represent bounding box and trajectory, respectively. Sequence of frames is from left to right, then top to bottom. Time interval between each frame is >10 s. These examples include different motions and various levels of occlusion.

    Figure  7.  Visualization of performance in generating bounding boxes by different methods

    Results of several trajectory tracking methods were compared with results of manual annotation to calculate accuracy. IoU, which measures accuracy of bounding box, was plotted for individual frames concatenated in time. A–E: Results of MonkeyTrail, SSD, YOLOv5, BSM, and FDM, with IoU shown in different colors. Green dashed line indicates mean value of IoU for MonkeyTrail, and red dashed lines represent mean values of IoU for corresponding methods. F: Amount of motion (calculated by length of trajectory movement) was plotted with the same time frame as in A–E. Gray box represents time when macaque is occluded by parts of cage. Data were from three monkeys, including 8130 frames.

    Figure  8.  Tracking success rates with systematically varying overlap thresholds for different tracking methods

    Success rate is percentage of total number of frames with IoU values greater than predefined threshold. Number in square brackets indicates average IoU value.
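The IoU and success-rate measures used in Figures 7 and 8 are standard evaluation metrics; a minimal sketch follows, assuming boxes in (x1, y1, x2, y2) format.

```python
import numpy as np

def iou(box_a, box_b):
    # Intersection-over-Union of two (x1, y1, x2, y2) boxes.
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union > 0 else 0.0

def success_rate(ious, threshold):
    # Fraction of frames whose IoU exceeds the threshold, as plotted
    # in Figure 8 for systematically varied thresholds.
    return float((np.asarray(ious) > threshold).mean())
```

Sweeping the threshold from 0 to 1 and plotting the success rate produces the success curves compared across methods in Figure 8.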

    Figure  9.  Daily total activity patterns of two monkeys (A and B) captured with MonkeyTrail

    Blue and red columns represent results obtained in 2019 and 2020, respectively. Average activity counts in each hourly time segment were obtained from 5-day recordings.

    Figure  10.  Spatial preference of macaques extracted by MonkeyTrail

    A, B: Results of monkey A obtained in 2019 and 2020, respectively; C, D: results of monkey B obtained in 2019 and 2020, respectively. Horizontal and vertical axes of heat map represent X and Y coordinates of cage, respectively. Each heat map region represents number of times macaque’s trajectory passed through this space, normalized by maximum number found in one region (color-coded). Each heat map was obtained by averaging trajectory data of five days.
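A heat map of this kind, occupancy counts per spatial bin normalized by the busiest bin, can be computed directly with np.histogram2d. The bin count and cage dimensions below are illustrative assumptions.

```python
import numpy as np

def occupancy_heatmap(trajectory, cage_shape, bins=(10, 10)):
    # Count how many times the trajectory passes through each spatial
    # bin of the cage, then normalize by the maximum bin count so the
    # busiest region maps to 1.0. trajectory is a list of (x, y)
    # points; cage_shape is (height, width) in the same units.
    xs = [p[0] for p in trajectory]
    ys = [p[1] for p in trajectory]
    h, _, _ = np.histogram2d(
        ys, xs, bins=bins,
        range=[[0, cage_shape[0]], [0, cage_shape[1]]])
    return h / h.max() if h.max() > 0 else h
```

Averaging such per-day heat maps over five days, as stated in the caption, yields the spatial-preference maps of Figure 10.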

  • [1] Bala PC, Eisenreich BR, Yoo SBM, Hayden BY, Park HS, Zimmermann J. 2020. Automated markerless pose estimation in freely moving macaques with OpenMonkeyStudio. Nature Communications, 11(1): 4560. doi: 10.1038/s41467-020-18441-5
    [2] Ballesta S, Reymond G, Pozzobon M, Duhamel JR. 2014. A real-time 3D video tracking system for monitoring primate groups. Journal of Neuroscience Methods, 234: 147−152. doi: 10.1016/j.jneumeth.2014.05.022
    [3] Bateson M, Martin PR. 2021. Measuring Behaviour: an Introductory Guide. 4th ed. Cambridge: Cambridge University Press.
    [4] Beckman D, Morrison JH. 2021. Towards developing a rhesus monkey model of early Alzheimer's disease focusing on women's health. American Journal of Primatology, 83(11): e23289.
    [5] Bezard E, Dovero S, Prunier C, Ravenscroft P, Chalon S, Guilloteau D, et al. 2001. Relationship between the appearance of symptoms and the level of nigrostriatal degeneration in a progressive 1-methyl-4-phenyl-1, 2, 3, 6-tetrahydropyridine-lesioned macaque model of Parkinson's disease. Journal of Neuroscience, 21(17): 6853−6861. doi: 10.1523/JNEUROSCI.21-17-06853.2001
    [6] Caiola M, Pittard D, Wichmann T, Galvan A. 2019. Quantification of movement in normal and parkinsonian macaques using video analysis. Journal of Neuroscience Methods, 322: 96−102. doi: 10.1016/j.jneumeth.2019.05.001
    [7] Chen YC, Yu JH, Niu YY, Qin DD, Liu HL, Li G, et al. 2017. Modeling rett syndrome using TALEN-edited MECP2 mutant cynomolgus monkeys. Cell, 169(5): 945−955.E10. doi: 10.1016/j.cell.2017.04.035
    [8] Francisco FA, Nührenberg P, Jordan AL. 2019. A low-cost, open-source framework for tracking and behavioural analysis of animals in aquatic ecosystems. bioRxiv: 571232.
    [9] Gibbs RA, Rogers J, Katze MG, Bumgarner R, Weinstock GM, Mardis ER, et al. 2007. Evolutionary and biomedical insights from the rhesus macaque genome. Science, 316(5822): 222−234. doi: 10.1126/science.1139247
    [10] Gonzalez RC, Woods RE. 2002. Digital Image Processing. 2nd ed. Upper Saddle River: Prentice Hall.
    [11] Graving JM, Chae D, Naik H, Li L, Koger B, Costelloe BR, et al. 2019. DeepPoseKit, a software toolkit for fast and robust animal pose estimation using deep learning. Elife, 8: e47994. doi: 10.7554/eLife.47994
    [12] Hashimoto T, Izawa Y, Yokoyama H, Kato T, Moriizumi T. 1999. A new video/computer method to measure the amount of overall movement in experimental animals (two-dimensional object-difference method). Journal of Neuroscience Methods, 91(1-2): 115−122. doi: 10.1016/S0165-0270(99)00082-5
    [13] Hu GY, Cui B, Yu S. 2020a. Joint learning in the spatio-temporal and frequency domains for skeleton-based action recognition. IEEE Transactions on Multimedia, 22(9): 2207−2220. doi: 10.1109/TMM.2019.2953325
    [14] Hu GY, Cui B, He Y, Yu S. 2020b. Progressive relation learning for group activity recognition. In: Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 977–986.
    [15] Jocher G. 2021. YOLOv5. https://github.com/ultralytics/yolov5.
    [16] Johansson G. 1973. Visual perception of biological motion and a model for its analysis. Perception & Psychophysics, 14(2): 201−211.
    [17] Krakauer JW, Ghazanfar AA, Gomez-Marin A, Maciver MA, Poeppel D. 2017. Neuroscience needs behavior: correcting a reductionist bias. Neuron, 93(3): 480−490. doi: 10.1016/j.neuron.2016.12.041
    [18] Lehner PN. 1987. Design and execution of animal behavior research: an overview. Journal of Animal Science, 65(5): 1213−1219. doi: 10.2527/jas1987.6551213x
    [19] Lind NM, Vinther M, Hemmingsen RP, Hansen AK. 2005. Validation of a digital video tracking system for recording pig locomotor behaviour. Journal of Neuroscience Methods, 143(2): 123−132. doi: 10.1016/j.jneumeth.2004.09.019
    [20] Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, et al. 2016a. SSD: single shot multibox detector. In: Proceedings of the 14th European Conference on Computer Vision. Amsterdam: Springer, 21–37.
    [21] Liu Z, Li X, Zhang JT, Cai YJ, Cheng TL, Cheng C, et al. 2016b. Autism-like behaviours and germline transmission in transgenic monkeys overexpressing MeCP2. Nature, 530(7588): 98−102. doi: 10.1038/nature16533
    [22] Mathis A, Mamidanna P, Cury KM, Abe T, Murthy VN, Mathis MW, et al. 2018. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nature Neuroscience, 21(9): 1281−1289. doi: 10.1038/s41593-018-0209-y
    [23] Mathis A, Schneider S, Lauer J, Mathis MW. 2020. A primer on motion capture with deep learning: principles, pitfalls, and perspectives. Neuron, 108(1): 44−65. doi: 10.1016/j.neuron.2020.09.017
    [24] Nice MM. 1954. Reviewed work: The Herring Gull's World. A study of the social behaviour of birds by Niko Tinbergen. Bird-Banding, 25(2): 81−82. doi: 10.2307/4510469
    [25] Pandya JD, Grondin R, Yonutas HM, Haghnazar H, Gash DM, Zhang ZM, et al. 2015. Decreased mitochondrial bioenergetics and calcium buffering capacity in the basal ganglia correlates with motor deficits in a nonhuman primate model of aging. Neurobiology of Aging, 36(5): 1903−1913. doi: 10.1016/j.neurobiolaging.2015.01.018
    [26] Redmon J, Divvala S, Girshick R, Farhadi A. 2016. You only look once: Unified, real-time object detection. In: Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 779–788.
    [27] Togasaki DM, Hsu A, Samant M, Farzan B, DeLanney LE, Langston JW, et al. 2005. The Webcam system: a simple, automated, computer-based video system for quantitative measurement of movement in nonhuman primates. Journal of Neuroscience Methods, 145(1-2): 159−166. doi: 10.1016/j.jneumeth.2004.12.010
    [28] Tzutalin. 2015. LabelImg. https://github.com/tzutalin/labelImg.
    [29] Ueno M, Hayashi H, Kabata R, Terada K, Yamada K. 2019. Automatically detecting and tracking free-ranging Japanese macaques in video recordings with deep learning and particle filters. Ethology, 125(5): 332−340. doi: 10.1111/eth.12851
    [30] Walton A, Branham A, Gash DM, Grondin R. 2006. Automated video analysis of age-related motor deficits in monkeys using EthoVision. Neurobiology of Aging, 27(10): 1477−1483. doi: 10.1016/j.neurobiolaging.2005.08.003
    [31] Wiltschko AB, Johnson MJ, Iurilli G, Peterson RE, Katon JM, Pashkovski SL, et al. 2015. Mapping sub-second structure in mouse behavior. Neuron, 88(6): 1121−1135. doi: 10.1016/j.neuron.2015.11.031
    [32] Wu Y, Lim J, Yang MH. 2013. Online object tracking: A benchmark. In: Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland: IEEE, 2411–2418.
    [33] Yabumoto T, Yoshida F, Miyauchi H, Baba K, Tsuda H, Ikenaka K, et al. 2019. MarmoDetector: a novel 3D automated system for the quantitative assessment of marmoset behavior. Journal of Neuroscience Methods, 322: 23−33. doi: 10.1016/j.jneumeth.2019.03.016
    [34] Yao Y, Jafarian Y, Park HS. 2019. MONET: multiview semi-supervised keypoint detection via epipolar divergence. In: Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 753–762.
  • ZR-2021-353-Supplementary video.zip
  • Received: 2022-01-19
  • Accepted: 2022-03-17
  • Published online: 2022-03-17
  • Issue date: 2022-05-18